1999 Book SecureInternetProgramming
Secure
Internet Programming
Security Issues
for Mobile and Distributed Objects
Springer
Series Editors
Gerhard Goos, Karlsruhe University, Germany
Juris Hartmanis, Cornell University, NY, USA
Jan van Leeuwen, Utrecht University, The Netherlands
Volume Editors
Jan Vitek
University of Geneva, Object Systems Group
CH-1211 Geneva 4, Switzerland
E-mail: [email protected]
Christian D. Jensen
Trinity College Dublin, Department of Computer Science
O'Reilly Institute, Dublin 2, Ireland
E-mail: [email protected]
This work is subject to copyright. All rights are reserved, whether the whole or part of the material is
concerned, specifically the rights of translation, reprinting, re-use of illustrations, recitation, broadcasting,
reproduction on microfilms or in any other way, and storage in data banks. Duplication of this publication
or parts thereof is permitted only under the provisions of the German Copyright Law of September 9, 1965,
in its current version, and permission for use must always be obtained from Springer-Verlag. Violations are
liable for prosecution under the German Copyright Law.
Overview
The book is organised in three parts: (I) Foundations, (II) Concepts, and (III) Imple-
mentation, followed by an appendix.
Part I of the book contains chapters giving background and dealing with funda-
mental issues in trust, programming, and mobile computations in large scale open dis-
tributed systems. The paper by Swarup and Fábrega is a comprehensive study of trust in
open distributed systems. The paper by Abadi discusses abstractions for protection and
the correctness of their implementation. The paper by Ancona et al. analyses the use
of reflection to integrate authorisation mechanisms into an object-oriented system. The
paper by Cardelli discusses the difficulties of computing with mobility and proposes a
unified framework to overcome these difficulties based on mobile computational am-
bients. The paper by Hennessy and Riely studies the type safety properties of mobile
code in an open distributed system. The paper by De Nicola et al. describes the secu-
rity mechanisms of the programming language KLAIM, which is used to program mobile
agents. The paper by Leroy and Rouaix formulates and proves security properties that
well-typed applets possess and identifies sufficient conditions for the execution envi-
ronment to be safe.
Part II contains descriptions of general concepts in security in open distributed sys-
tems. The paper by Blaze et al. describes the use of trust management engines to avoid
the need to resolve "identities" in an authorisation decision. The paper by Aura dis-
cusses the advantages and limitations of delegation certificates in access control mech-
anisms. The paper by Brose proposes a new fine grained access control model for
CORBA based on views. The paper by Tschudin introduces the concept of apopto-
sis (programmed death) in mobile code based services. The paper by Yee discusses the
problem of ensuring confidentiality and integrity for mobile agents. The paper by Roth
describes how two co-operating agents, executing on different machines, can be used
to protect the confidentiality and integrity of the individual agent against tampering.
Part III contains papers detailing implementations of security concepts in open dis-
tributed systems. Most of the papers in this part also introduce new security concepts,
but devote a large portion to a particular implementation of these concepts. The paper
by Jaeger describes the use of role based access control policies in configurable sys-
tems and its implementation in the Lava Security Architecture. The paper by Grimm
and Bershad discusses secure execution of possibly untrusted extensions in the SPIN
extensible operating system. The paper by Jones describes a simple and efficient way
of interposing code between user programs and the underlying system's interface. The
paper by von Eicken et al. discusses the inadequacy of object references for access
control and describes an implementation of capabilities in the J-Kernel. The paper by
van Doorn et al. describes the implementation of secure network objects, which is an
extension of Modula-3 network objects. The paper by Edjlali et al. describes an ac-
cess control mechanism in which access is granted based on the history of interactions
with the requesting principal. The paper by Alexander et al. discusses security issues
in active networks, and the solutions that have been implemented in the Secure Ac-
tive Network Environment. The paper by Hulaas et al. discusses the problem of mobile
agent interactions in an open environment. The paper by Wilhelm et al. describes how
trusted tamper resistant devices can be used to ensure the integrity of mobile agents.
Acknowledgements
We would like to thank the members of the programme committees of the two work-
shops. For EWDOS, they were: George Coulouris, Leendert van Doorn, Li Gong,
Daniel Hagimont, Trent Jaeger. For MOS98, the committee included: Martin Abadi,
Brian Bershad, Ciaran Bryce, Luca Cardelli, Giuseppe Castagna, Robert Gray, Leila
Ismail, Dag Johansen, Eric Jul, Doug Lea, Christian Tschudin, Dennis Volpano. Fur-
thermore, we wish to thank: Vinny Cahill, Daniel LeMetayer, Tommy Thorn, Hitesh
Tewari, and Mary Ellen Zurko for additional reviewing, and Alfred Hofmann from
Springer-Verlag for his support in getting the volume published.
I Foundations
II Concepts
III Implementation
IV Appendix
Foundations
Trust: Benefits, Models, and Mechanisms
1 Introduction
Trust is a fundamental concept in computer security. However, while many cen-
tral concepts in computer security such as privacy and integrity now have com-
monly accepted definitions, trust remains an ambiguous term. There are nu-
merous notions of trust in the computer security literature, including trust as
assurance in the correct and secure functioning of software, computer systems,
and legal systems; trust as belief in the benevolent, honest, competent, and pre-
dictable behavior of autonomous agents (human or software); and trust as a
tendency to depend on others. Different kinds of trust often satisfy different
properties; for instance, some kinds of trust are transitive (e.g., trust in the con-
text of network administration tools such as SATAN [11]) while others are not
(e.g., trust in the context of name-key binding mechanisms such as PGP).
Trust-based security architectures are widely deployed. For instance, in net-
work administration, trust is a relationship that exists between an entity man-
aging local resources and a remote client, whenever that client can access lo-
cal resources without local authentication or authorization (e.g., .rhost files in
Unix) [11]. In network security, a prevalent paradigm is to partition the network
into regions and to use firewalls to filter network packets between regions; the fil-
tering policies implemented by the firewalls are based on the trust relationships
between regions [16]. In distributed system security, participants use identity
certificates to identify each other; they need to trust other participants to is-
sue good certificates and to recommend reliable third parties for accomplishing
specific tasks, including signing certificates [14, 9]. In mobile agent security, a
mobile agent may be given access to resources at a host based on the host's
level of trust in the agent; this trust can depend on many factors including the
endorser of the agent, the hosts that the agent has visited in the past [12, 2],
and the access requests that the agent has made in the past [8].
Trust is important due to the many practical benefits it provides. For in-
stance, trust serves many important roles in agent societies: it is central to
agents engaging in cooperative activities; it provides an inexpensive (though of-
ten imprecise) basis for lowering expensive access barriers between agents; and
it enables agents to form clusters within which complex transactions have a high
likelihood of success. We need computational models of trust that describe how
trust between autonomous agents can be produced, manipulated, and degraded.
Agents can develop trust in each other by using mechanisms that implement
those models [21, 27]. This will enable software agents to trade in unfamiliar
environments with unfamiliar trading partners by relying on trust that is built
remotely across computer networks; such remotely built trust appears essential
for agents to play an important role in distributed electronic commerce.
In this paper, we elaborate on the various distinctions on trust that are
mentioned above, drawing from a vast literature on trust in the social sciences
and in computer security. In Section 2, we present a classification of the various
meanings of trust. Although we consider the full range of meanings of trust, our
focus is on trust between agents and not on system trust or dispositional trust.
In Section 3, we motivate the importance of trust by describing its central role in
three significant areas: cooperation, lowering of access barriers, and clustering. In
Section 4, we describe how trust can be produced, manipulated, and degraded.
In Section 5, we summarize graph models of trust propagation that have been
proposed in the literature. Section 6 concludes this paper.
2 Classification
System Trust: "The extent to which an entity believes that proper system
structures are in place to enable it to anticipate a successful future en-
deavor." [23] Safeguards such as regulations, guarantees, and security mecha-
nisms reduce the potential negative consequences of trusting behavior. Stabi-
lizing intermediaries (e.g., insurance companies) reduce uncertainty and risk
(e.g., financial risk). Most work in information systems security is aimed
at managing risk, thus increasing system trust and enabling entities to feel
more secure in trusting others.
Entity Trust: Entity trust includes all trust relationships between two or more
entities. These relationships can be further categorized as follows:
Trusting Belief: "The extent to which an entity believes that another en-
tity is willing and able to act in the former's best interests." [23] The
belief can be based on a variety of attributes including the other's benevo-
lence, honesty, competence, and predictability. Assessing these attributes
directly is an inherently subjective process and is typically handled by
procedural mechanisms.
As an example, consider public key certification schemes (e.g., PGP and
X.509) that use identity certificates to bind keys to names. The schemes
include computational trust models [14,10] that enable the holder of a set
of certificates to decide whether to believe a particular name-key binding.
This decision may be based on the holder's belief in the trustworthiness
of the issuers of the certificates. We will examine several such models in
Section 5.
Trusting Intention: "The extent to which an entity is willing to depend
on another entity in a given situation with a feeling of relative security,
even though negative consequences are possible." [23] For instance, if a
trust metric for the PGP trust model yields some trust value for the
issuer of an identity certificate, the holder of the certificate must decide
whether the trust value is sufficient for it to believe the name-key binding
stated in the certificate. This decision function represents the trusting
intention of the holder.
Trusting Behavior: "The extent to which an entity depends on another
entity in a given situation with a feeling of relative security, even though
negative consequences are possible." [23]. For instance, a .rhost file on a
Unix host represents unconditional trusting behavior. If a user joe on
host orion creates a .rhost file that contains the name of host pluto,
then any user authenticated as joe on pluto can access joe's account
on orion without further authentication. Thus joe must trust pluto
to enforce appropriate security safeguards to protect joe's account on
orion.
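The .rhost behavior just described can be sketched as a small decision function. This is only an illustration of the trust relationship, not the full rhosts(5) syntax; representing the file as a set of host names is a simplification.

```python
# A minimal sketch of .rhost-style trusting behavior: access is granted
# without further authentication when the remote host appears in the local
# user's .rhost entries and the account names match.

def rhost_allows(trusted_hosts, remote_host, remote_user, local_user):
    """Decide whether the remote user may access the local account."""
    return remote_host in trusted_hosts and remote_user == local_user

# joe on orion lists pluto, so joe authenticated on pluto gets in;
# the same user coming from an unlisted host does not.
joes_rhost_on_orion = {"pluto"}
print(rhost_allows(joes_rhost_on_orion, "pluto", "joe", "joe"))  # True
print(rhost_allows(joes_rhost_on_orion, "venus", "joe", "joe"))  # False
```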
Trusting belief, intention, and behavior are described as relations between
two entities. However, these same notions can describe trust relationships
between multiple entities. McKnight and Chervany [23] identify the following
kind of multi-entity trust as important and prevalent:
Situational Decision to Trust: "The extent to which an entity intends
to depend on other entities in a given situation." [23] For instance, a
user may intend to depend on the system administrators (to recover her
files) in the event that her files get corrupted.
Dispositional Trust: "The extent to which an entity has a consistent ten-
dency to trust across a broad spectrum of situations and entities." [23] This
kind of trust may be viewed as a policy or strategy which produces trusting
intentions. It could arise because the entity believes others to be trustworthy
in general, or because it expects to obtain better outcomes by trusting others
irrespective of whether they are trustworthy. In computational trust models,
dispositional trust is manifested by threshold values. For instance, an entity
may enter a dependency relationship with another entity if its trusting belief
expectation in that entity exceeds its dispositional trust threshold.
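As a concrete illustration of thresholds standing in for dispositional trust, the following sketch compares one belief expectation against two different dispositions; the numeric values are invented.

```python
# Dispositional trust manifested as a threshold: an entity enters a
# dependency relationship only if its trusting-belief expectation in the
# other entity exceeds its own dispositional threshold.

def willing_to_depend(belief_expectation, dispositional_threshold):
    return belief_expectation > dispositional_threshold

belief_in_partner = 0.6
print(willing_to_depend(belief_in_partner, 0.3))  # True: a trusting disposition
print(willing_to_depend(belief_in_partner, 0.8))  # False: a wary disposition
```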
3 Benefits
System trust plays an important role in enabling and sustaining cooperation be-
tween agents. Entity trust between two agents reduces the costs of identification
and access control between the agents, while entity trust among multiple agents
reduces the costs of complex multi-agent transactions. These benefits underscore
the importance of trust in distributed agent systems.
4 Operations
Entity trust between autonomous agents can be produced by operations such
as multiple successful interactions, manipulated by operations such as propaga-
tion and substitution, and degraded by operations such as natural decay and
malicious attacks.
4.1 Production
which are based on rational thought. In agent systems, social trust appears
in models where one's trust in an agent is based on one's social trust in the
principals that have signed (and hence endorsed) the agent.
Current computational trust models assume that direct trust valuations are
given as input and they describe how to compute indirect trust (see Section 4.2).
Establishing and assessing direct trust is considered to be subjective and is
outside the scope of these models. By contrast, agent-based systems cannot rely
on manual trust valuations and need technical mechanisms for producing trust
directly.
4.2 Manipulation
An agent may not be able to establish trust in another agent directly. The agent
can then use mechanisms that obviate the need to establish the trust [26], or
mechanisms to establish trust indirectly (that is, involving other "third-party"
agents).
Trust substitution mechanisms alter the trust relationships that would oth-
erwise be required between agents. These include technical solutions that
eliminate one's need to trust an agent, substituting it with trust in a third
party instead. For instance, in the physical world, the trustworthiness of a
tamper-resistant smartcard depends on the manufacturer of the card and
is independent of the entity that physically hosts the card. If the card is
used for electronic commerce, then participants in transactions need only
trust the manufacturer of the card and not necessarily its owner. Mobile
cryptography [30] provides similar functionality for software agents.
Trust propagation mechanisms propagate trust from one principal to another.
This kind of trust is known as indirect trust. For instance, professional or-
ganizations and recommendation services impart trust by virtue of their
reputations, and insurance companies impart trust by increasing the (finan-
cial) predictability of outcomes. In all these cases, trust flows from reputable
institutions to other principals.
Propagation mechanisms typically combine different trust valuations from
different principals into new trust valuations. Certain kinds of trust are
transitive while others are not. Trusting behavior (see Section 2) is often
transitive. For instance, if a UNIX host venus has a trusting relationship
with host pluto via a .rhost file, and pluto has a similar relationship with
host orion, then venus also has a trusting behavior relationship with orion.
Trusting beliefs are often not transitive. For instance, if Nisha trusts Ignacio
to make good recommendations, and Ignacio trusts William to make good
recommendations, then Nisha does not necessarily trust William. Most work
on formal computational trust models deals with non-transitive trust propagation;
the aim of these models is to address the name-key binding problem
of identification and authentication. Section 5 presents a summary of such
models.
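The contrast between transitive trusting behavior and non-transitive trusting belief can be sketched as follows. The edge sets are the ones from the examples above; taking the transitive closure for behavior, but not for belief, is the point being illustrated.

```python
# Trusting *behavior* (e.g., .rhost relationships) composes along paths,
# so we take the transitive closure; trusting *belief* does not compose,
# so only direct edges count.

def reachable(edges, start):
    """All nodes reachable from `start` along directed trust edges."""
    seen, stack = set(), [start]
    while stack:
        node = stack.pop()
        for a, b in edges:
            if a == node and b not in seen:
                seen.add(b)
                stack.append(b)
    return seen

behavior_edges = [("venus", "pluto"), ("pluto", "orion")]
# venus trusts pluto, pluto trusts orion, so venus transitively trusts orion.
print(sorted(reachable(behavior_edges, "venus")))  # ['orion', 'pluto']

belief_edges = [("Nisha", "Ignacio"), ("Ignacio", "William")]
# Non-transitive belief: only the direct edges count, and there is none.
print(("Nisha", "William") in belief_edges)  # False
```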
4.3 Degradation
5 Models
The social science literature contains numerous models of trust production while
the computer security literature contains numerous models of trust propagation.
Trust production is a subjective process for which adequate computational mod-
els have yet to be developed. Thus, in this section, we examine formal models of
trust propagation only and we describe some of their limitations. The axiomatic
(or mathematical) models of trust discussed here are graph models in the sense
that the mathematical structure used to model trust is a finite directed graph;
we do not consider models based on belief logics [25, 7, 1, 19].
In graph models, the nodes of a graph represent principals and the directed
edges of a graph represent trust relationships between principals. There may
be additional structure associated with a graph, such as labels on the nodes
and edges. The graphs are used to propagate beliefs about statements (e.g.,
"the key P belongs to Nisha"). The trust graphs are extended with nodes that
represent statements and with edges from principal nodes to statement nodes
that represent certificates (i.e., signed statements). These axiomatic models are
useful to the extent they allow deduction and calculation using unambiguous
rules. A given axiomatic model still needs justification to determine whether it
says anything useful about the real world.
1. The graph consists of two disjoint sets of nodes: A set of principals (keys)
and a set of statements (e.g., name-key bindings).
2. Directed edges are of two kinds:
— An arrow k → k' means k signs a certificate saying "k trusts k' to issue
certificates with valid statements only". The trust represented by this
edge has been referred to in the literature as recommendation trust ([3]).
There is a similar concept in [20] referred to as delegation.
— An arrow k ~→ s means k signs a certificate asserting the statement s;
these certificate edges connect principal nodes to statement nodes.
[Figure 1: a trust graph over principals A, B, C, D, E, F, H and statements P:Nisha, Q:Ignacio, R:William.]
Figure 1 contains a simple trust graph in which there are seven principals
A,B,C,D,E,F,H and three statements P: Nisha, Q: Ignacio, R: William. The
statement P: Nisha asserts that the key P belongs to Nisha. The edge C ~→
P: Nisha represents a certificate issued by C that says P: Nisha. The edge B → C
represents a certificate issued by B that says "B trusts C to issue certificates
with valid statements only".
Definition 2. Let 𝒢 be the set of trust graphs. A trust decision rule β associated
to 𝒢 is a boolean function (G, k, s) ↦ β(G, k, s), defined whenever G is a graph
in 𝒢, k is a key in G, and s is a statement in G.
[Figure: the trust graph of Figure 1, with statements P:Nisha, Q:Ignacio, R:William, indicating which keys accept the statement P:Nisha.]
statement s. Thus, according to this rule, the keys A,B,D are the only ones to
accept the statement P: Nisha. Moreover, this rule rejects all other statements.
Notice that under this trust decision rule, nodes C, F, H in the example should
not accept any statements, including the statements they certify; these nodes
may accept statements based on other models of trust such as a model of direct
trust. The literature describes several trust decision rules, including rules where
β(G, k, s) = true only when there is a path from k to s whose length is smaller
than some threshold [33, 14], or when the number of bounded node-disjoint paths
from k to s exceeds some threshold [28].
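The path-length rule can be sketched directly on a small graph. The edge set below is hypothetical (it is not the chapter's Figure 1): a chain of recommendation edges between keys, plus one certificate edge into the statement node.

```python
# A sketch of a path-length trust decision rule: beta(G, k, s) is true only
# when some oriented path from key k to statement s is no longer than a
# threshold.
from collections import deque

def shortest_path_len(edges, k, s):
    """BFS distance from k to s in a directed trust graph; None if unreachable."""
    dist = {k: 0}
    queue = deque([k])
    while queue:
        node = queue.popleft()
        if node == s:
            return dist[node]
        for a, b in edges:
            if a == node and b not in dist:
                dist[b] = dist[node] + 1
                queue.append(b)
    return None

def beta(edges, k, s, threshold):
    d = shortest_path_len(edges, k, s)
    return d is not None and d <= threshold

# Recommendation edges A -> B -> C plus one certificate edge C -> "P:Nisha".
edges = [("A", "B"), ("B", "C"), ("C", "P:Nisha")]
print(beta(edges, "A", "P:Nisha", threshold=3))  # True: a path of length 3
print(beta(edges, "A", "P:Nisha", threshold=2))  # False: the path is too long
print(beta(edges, "C", "A", threshold=3))        # False: no oriented path
```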
Notice that a trust decision rule depends only on the graph and is indepen-
dent of the trust state. Moreover, as defined above, the value of β(G, k, s) may
depend on the structure of the entire graph, including parts of the graph preced-
ing the node k along an oriented path. However, a key k should decide trust not
on who trusts k but on whom k trusts. To make this notion precise we define a
trust "interval": Given a statement node s and a key node k of G, G[k, s] is the
set of nodes that lie on some oriented path starting at k and ending at s. Notice
that k, s are both nodes in G[k,s]. Moreover, s is the only statement node in
G[k,s].
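The interval G[k, s] can be computed as the nodes reachable from k that can themselves reach s, which is exactly the set of nodes on some oriented path from k to s. The edge set below is hypothetical.

```python
# A sketch of the trust "interval" G[k, s]: the nodes lying on some oriented
# path that starts at k and ends at s.

def reach(edges, start):
    """All nodes reachable from `start` (including `start` itself)."""
    seen, stack = {start}, [start]
    while stack:
        n = stack.pop()
        for a, b in edges:
            if a == n and b not in seen:
                seen.add(b)
                stack.append(b)
    return seen

def interval(edges, k, s):
    forward = reach(edges, k)                        # reachable from k
    backward = reach([(b, a) for a, b in edges], s)  # nodes that can reach s
    return forward & backward

# D is reachable from A but cannot reach the statement; H can reach the
# statement but is not reachable from A; neither lies in the interval.
edges = [("A", "B"), ("B", "C"), ("C", "P:Nisha"), ("H", "C"), ("A", "D")]
print(sorted(interval(edges, "A", "P:Nisha")))  # ['A', 'B', 'C', 'P:Nisha']
```

Note that both k and s belong to the interval, matching the definition above.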
[Figure: a valued trust graph over principals A, B, C, D, E, F, H and statements P:Nisha, Q:Ignacio, R:William, in which each edge is labeled with a pair of numbers, e.g. [0.3, 3].]
valid certificates but not to make good recommendations. Some models assign
values to trust relationships based on various presumed properties of trust such
as likelihood of failure (of trusting behavior), cost of failure, and an aggregation
from assessments of third parties. The notions of trust state, trust decision rule,
trust attack, and false positive rate apply to these valued graph models as well
and we consider them below.
Network Flow Models. Figure 5 contains a valued trust graph in which each
edge is labeled by a pair of numbers that represents the flow capacity and the
cost of the edge. Trust decision rules include rules where β(G, k, s) = true only
when the cost per unit flow from k to s is smaller than some threshold, or
when the maximum network flow from k to s exceeds some threshold [20]. Note
that with unit costs and capacities, these metrics reduce to shortest path and
edge-disjoint path metrics for simple trust graphs.
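A max-flow trust rule of this kind can be sketched with a small Edmonds-Karp computation. The graph and capacities below are invented (given in tenths so the arithmetic stays exact) and are not those of Figure 5; cost labels are omitted.

```python
# A sketch of a network-flow trust rule: beta(G, k, s) is true only when the
# maximum flow from k to s exceeds a threshold. With unit capacities this
# reduces to counting edge-disjoint paths.
from collections import defaultdict, deque

def max_flow(capacity, source, sink):
    """Edmonds-Karp: repeatedly augment along shortest residual paths."""
    residual = defaultdict(lambda: defaultdict(int))
    for (a, b), c in capacity.items():
        residual[a][b] += c
    flow = 0
    while True:
        parent = {source: None}
        queue = deque([source])
        while queue and sink not in parent:   # BFS for an augmenting path
            n = queue.popleft()
            for m, c in list(residual[n].items()):
                if c > 0 and m not in parent:
                    parent[m] = n
                    queue.append(m)
        if sink not in parent:
            return flow
        path, n = [], sink                    # walk back to the source
        while parent[n] is not None:
            path.append((parent[n], n))
            n = parent[n]
        bottleneck = min(residual[a][b] for a, b in path)
        for a, b in path:                     # update residual capacities
            residual[a][b] -= bottleneck
            residual[b][a] += bottleneck
        flow += bottleneck

# Capacities in tenths (10 stands for 1.0); two routes into C, but C can
# pass at most 5 on to the statement node.
caps = {("A", "B"): 10, ("A", "H"): 5, ("B", "C"): 3,
        ("H", "C"): 6, ("C", "P:Nisha"): 5}
print(max_flow(caps, "A", "P:Nisha"))  # 5
```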
6 Conclusion
Trust plays a critical role in several aspects of agent societies. First, cooperation
between agents depends on their trust in the underlying computer systems and
software. Second, cooperative transactions can incur substantial costs due to ac-
cess barriers between agents; trust between agents enables them to lower these
access barriers thus reducing transaction costs. Finally, trust can lead to the
clustering of agents into communities within which complex multi-agent trans-
actions have a high likelihood of success.
Trust is an overloaded term with numerous meanings, each with distinct
properties. It can be established directly between two or more agents in a variety
of ways. It can be propagated among a network of agents, and one form of trust
can be substituted by other forms. Finally, it can be degraded in several ways.
This paper illustrates the complex nature of trust and the numerous problems
that remain to be solved before software agents can employ computational trust
models.
References
[1] M. Abadi, M. Burrows, B. Lampson, and G. Plotkin. A calculus for access con-
trol in distributed systems. ACM Transactions on Programming Languages and
Systems, 15(4):706-734, October 1993.
[2] S. Berkovits, J. D. Guttman, and V. Swarup. Authentication for mobile agents.
Lecture Notes in Computer Science 1419, Special issue on Mobile Agents and
Security, 1998.
[3] T. Beth, M. Borcherding, and B. Klein. Valuation of trust in open networks.
In D. Gollmann, editor, Proceedings of the European Symposium on Research in
Computer Security (ESORICS), LNCS 875, pages 3-18. Springer Verlag, 1994.
[4] A. Birrell, B. Lampson, R. Needham, and M. Shroeder. A global authentication
service without global trust. In Proceedings of the IEEE Symposium on Security
and Privacy, pages 223-230, 1986.
[5] M. Blaze, J. Feigenbaum, and J. Lacy. Decentralized trust management. In
Proceedings of the IEEE Symposium on Security and Privacy, pages 164-173,
1996.
[6] B. Borcherding and M. Borcherding. Covered trust values in distributed systems.
In Proceedings of the Working Conference on Multimedia and Communication
Security, pages 24-31. Chapman & Hall, 1995.
[7] Michael Burrows, Martin Abadi, and Roger Needham. A logic of authentication.
Proceedings of the Royal Society, Series A, 426(1871):233-271, December 1989.
Also appeared as SRC Research Report 39 and, in a shortened form, in ACM
Transactions on Computer Systems 8, 1 (February 1990), 18-36.
[8] G. Edjlali, A. Acharya, and V. Chaudhary. History-based access control for mobile
code. In Proceedings of the ACM Conference on Computer and Communications
Security, 1998.
[9] C. M. Ellison, B. Frantz, B. Lampson, R. Rivest, B. M. Thomas, and T. Ylonen.
Simple public key certificate. Internet Draft (Work in Progress), November 1998.
[10] C. M. Ellison, B. Frantz, B. Lampson, R. Rivest, B. M. Thomas, and T. Ylonen.
SPKI certificate theory. Internet Draft (Work in Progress), November 1998.
[11] D. Farmer and W. Venema. SATAN Overview, 1995. http://www.fish.com/.
[12] W. M. Farmer, J. D. Guttman, and V. Swarup. Security for mobile agents:
Authentication and state appraisal. In Proceedings of the European Symposium
on Research in Computer Security (ESORICS), LNCS 1146, pages 118-130, 1996.
[13] F. Fukuyama. Trust: The Social Virtues and the Creation of Prosperity. Free
Press, June 1996.
[14] S. Garfinkel. PGP: Pretty Good Privacy. O'Reilly and Associates, 1994.
[15] E. Gerck. Towards a real-world model of trust: reliance on received information.
MCG, 1998. http://www.mcg.org.br/trustdef.htm.
[16] J. D. Guttman. Filtering postures: Local enforcement for global policies. In
Proceedings of the IEEE Symposium on Security and Privacy, 1997.
[17] A. Josang. A model for trust in security systems. In Proceedings of the Second
Nordic Workshop on Secure Computer Systems, 1997.
[18] R. M. Kramer and T. R. Tyler, editors. Trust in Organizations: Frontiers of
Theory and Research. Sage Publications, February 1996.
[19] B. Lampson, M. Abadi, M. Burrows, and E. Wobber. Authentication in dis-
tributed systems: Theory and practice. ACM Transactions on Computer Systems,
10(4):265-310, November 1992.
[20] R. Levien and A. Aiken. Attack-resistant trust metrics for public key certification.
In Proceedings of the 7th USENIX Security Symposium, 1998.
[21] S.P. Marsh. Formalising Trust as a Computational Concept. PhD thesis, De-
partment of Computer Science and Mathematics, University of Stirling, April
1994.
[22] U. Maurer. Modeling a public-key infrastructure. In Proceedings of the European
Symposium on Research in Computer Security (ESORICS), LNCS 1146, pages
118-130. Springer Verlag, 1996.
[23] D.H. McKnight and N.L. Chervany. The meanings of trust. Working pa-
per, Carlson School of Management, University of Minnesota, 1996. http://
www.misrc.umn.edu/wpaper/wp96-04.htm.
[24] B. A. Misztal. Trust in Modern Societies: The Search for the Bases of Social
Order. Polity Press, December 1995.
[25] P. Venkat Rangan. An axiomatic basis of trust in distributed systems. In Pro-
ceedings of the IEEE Symposium on Security and Privacy, pages 204-210, 1988.
[26] J.M. Reagle. Trust in a cryptographic economy and digital security deposits:
Protocols and policies. Master's thesis, Technology and Policy Program, Mas-
sachusetts Institute of Technology, May 1996.
[27] J.M. Reagle. Trust in electronic markets: The convergence of cryptographers and
economists. First Monday, 1(2), August 1996. http://www.firstmonday.dk/
issues/issue2/markets/index.html.
[28] M. K. Reiter and S. G. Stubblebine. Path independence for authentication in
large-scale systems. In Proceedings of the 4th ACM Conference on Computer and
Communications Security, pages 57-66, 1997.
[29] M. K. Reiter and S. G. Stubblebine. Toward acceptable metrics of authentication.
In Proceedings of the IEEE Symposium on Security and Privacy, pages 3-18, 1997.
[30] T. Sander and C. Tschudin. Towards mobile cryptography. In Proceedings of the
IEEE Symposium on Security and Privacy, 1998.
[31] A. Tarah and C. Huitema. Associating metrics to certification paths. In Proceed-
ings of the European Symposium on Research in Computer Security (ESORICS),
LNCS 648, pages 175-189. Springer Verlag, 1992.
[32] Bernard Williams. Formal structures and social reality. In D. Gambetta, editor,
Trust: Making and Breaking Cooperative Relations, pages 3-13. Basil Blackwell,
1988.
[33] R. Yahalom, B. Klein, and Th. Beth. Trust relationships in secure systems - a
distributed authentication perspective. In Proceedings of the IEEE Symposium
on Security and Privacy, 1993.
[34] R. Yahalom, B. Klein, and Th. Beth. Trust-based navigation in distributed sys-
tems. Computing Systems, 7(l):45-73, 1994.
Protection in
Programming-Language Translations*
Martin Abadi
1 Introduction
Tangible crimes and measures against those crimes are sometimes explained
through abstract models—with mixed results, as the detective Erik Lönnrot
discovered [9]. Protection in computer systems relies on abstractions too. For
example, an access matrix is a high-level specification that describes the allowed
accesses of subjects to objects in a computer system; the system may rely on
mechanisms such as access lists and capabilities for implementing an access ma-
trix [22].
Abstractions are often embodied in programming-language constructs. Re-
cent work on Java [17] has popularized the idea that languages are relevant
to security, but the relation between languages and security is much older. In
particular, objects and types have long been used for protection against incompe-
tence and malice, at least since the 1970s [36, 24, 20]. In the realm of distributed
systems, programming languages (or their libraries) have sometimes provided ab-
stractions for communication on secure channels of the kind implemented with
cryptography [7,49,47,50,46].
Security depends not only on the design of clear and expressive abstractions
but also on the correctness of their implementations. Unfortunately, the crite-
ria for correctness are rarely stated precisely—and presumably they are rarely
met. These criteria seem particularly delicate when a principal relies on those
abstractions but interacts with other principals at a lower level. For example,
the principal may express its programs and policies in terms of objects and re-
mote method invocations, but may send and receive bit strings. Moreover, the
bit strings that it receives may not have been the output of software trusted to
respect the abstractions. Such situations seem to be more common now than in
the 1970s.
One of the difficulties in the correct implementation of secure systems is
that the standard notion of refinement (e.g., [19, 21]) does not preserve security
class C {
    private int x;
    public void set_x(int v) {
        .framelimits locals = 2, stack = 2;
        aload_0;    // load this
        iload_1;    // load v
        putfield x; // set x
    };
}
— In our example, the translation retains the qualifier private for x. The oc-
currence of this qualifier at the JVML level may not be surprising, but it
cannot be taken for granted. (At the JVML level, the qualifier does not have
the benefit of helping programmers adhere to sound software-engineering
practices, since programmers hardly ever write JVML, so the qualifier might
have been omitted from JVML.)
class D {
private int x;
public void set_x(int v) {
t h i s . x = v;
};
static int get_x(D d) {
r e t u r n d.x;
};
}
class E {
    ... get_x ...
}
Here E is moved to the top level. A method get_x is added to D and used in E
for reading x; the details of E do not matter for our purposes. The method get_x
can be used not just in E, however—any other class within the same package
may refer to get_x.
When the classes D and E are compiled to JVML, therefore, a JVML context
may be able to read x in a way that was not possible at the Java level. This pos-
sibility results in the loss of full abstraction, since there is a JVML context that
distinguishes objects that could not be distinguished by any Java context. More
precisely, a JVML context that runs get_x and returns the result distinguishes
instances of D with different values for x.
This loss of full abstraction may result in the leak of some sensitive infor-
mation, if any was stored in the field x. The leak of the contents of a private
component of an object can be a concern when the object is part of the Java
Virtual Machine, or when it is trusted by the Java Virtual Machine (for exam-
ple, because a trusted principal digitally signed the object's class). On the other
hand, when the object is part of an applet, this leak should not be surprising:
applets cannot usually be protected from their execution environments.
For better or for worse, the Java security story is more complicated and
dynamic than the discussion above might suggest. In addition to protection by
the qualifier private, Java has a default mode of protection that protects classes
in one package against classes in other packages. At the language level, this mode
of protection is void—any class can claim to belong to any package. However,
Java class loaders can treat certain packages in special ways, guaranteeing that
only trusted classes belong to them. Our example with inner classes does not
pose a security problem as long as D and E are in one of those packages.
In hindsight, it is not clear whether one should base any security expectations
on qualifiers like private, and more generally on other Java constructs. As Dean
et al. have argued [12], the definition of Java is weaker than it should be from a
security viewpoint. Although it would be prudent to strengthen that definition,
The formal setting for this section is the pi calculus [32,34,33], which serves
as a core calculus with primitives for creating and using channels. By applying
the pi calculus restriction operator, these channels can be made private. We dis-
cuss the problem of mapping the pi calculus to a lower-level calculus, the spi
calculus [4,5,3], implementing communication on private channels by encrypted
communication on public channels. Several low-level attacks can be cast as coun-
terexamples to the full abstraction of this mapping. Some of the attacks can be
thwarted through techniques common in the literature on protocol design [29].
Some other attacks suggest fundamental difficulties in achieving full abstraction
for the pi calculus.
First we briefly review the spi calculus. In the variant that we consider here,
the syntax of this calculus assumes an infinite set of names and an infinite set
of variables. We let c, d, m, n, and p range over names, and let w, x, y, and z
range over variables. We usually assume that all these names and variables are
different (for example, that m and n are different names). The set of terms of
the spi calculus is defined by the following grammar:
L, M, N ::=                      terms
    n                            name
    x                            variable
    {M1, ..., Mk}N               encryption (k ≥ 0)
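As a side illustration (our own encoding, not part of the chapter), this grammar of terms can be transcribed into Java as a small class hierarchy:

```java
// Sketch: spi calculus terms as a class hierarchy. A term is a name n,
// a variable x, or an encryption {M1,...,Mk}N of k terms under key term N.
abstract class Term {}

final class Name extends Term {
    final String id;
    Name(String id) { this.id = id; }
}

final class Variable extends Term {
    final String id;
    Variable(String id) { this.id = id; }
}

final class Encryption extends Term {
    final Term[] body;   // M1, ..., Mk (k >= 0)
    final Term key;      // N
    Encryption(Term key, Term... body) { this.key = key; this.body = body; }
}
```

For example, `new Encryption(new Name("n"), new Name("m"))` builds the term {m}n.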
An output process M⟨N1, ..., Nk⟩ sends the tuple N1, ..., Nk on M. An input
process M(x1, ..., xk).Q is ready to input k terms N1, ..., Nk on M, and then
to behave as Q[N1/x1, ..., Nk/xk]. Here we write Q[N1/x1, ..., Nk/xk] for the
result of replacing each free occurrence of xi in Q with Ni, for i ∈ 1..k. Both
M(x1, ..., xk).Q and case L of {x1, ..., xk}N in P (explained below) bind the
variables x1, ..., xk. The nil process 0 does nothing. A composition P | Q
behaves as P and Q running in parallel. A replication !P behaves as infinitely
many copies of P running in parallel. A restriction (νn)P makes a new name
n and then behaves as P; it binds the name n. A match process [M is N] P
behaves as P if M and N are equal; otherwise it does nothing. A decryption
process case L of {x1, ..., xk}N in P attempts to decrypt L with the key N; if L
has the form {M1, ..., Mk}N, then the process behaves as P[M1/x1, ..., Mk/xk];
otherwise it does nothing.
By omitting the constructs {M1, ..., Mk}N and case L of {x1, ..., xk}N in P
from these grammars, we obtain the syntax of the pi calculus (more precisely, of
a polyadic, asynchronous version of the pi calculus).
As a first example, we consider the trivial pi calculus process:
(νn)(n⟨m⟩ | n(x).0)
This is a process that creates a channel n, then uses it for transmitting the name
m, with no further consequence. Communication on n is secure in the sense that
no context can discover m by interacting with this process, and no context can
cause a different message to be sent on n; these are typical secrecy and integrity
properties. Such properties can be expressed as equivalences (in particular, as
testing equivalences [11,8,3]). For example, we may express the secrecy of m as
the equivalence between (νn)(n⟨m⟩ | n(x).0) and (νn)(n⟨m'⟩ | n(x).0), for any
names m and m'.
Intuitively, the subprocesses n⟨m⟩ and n(x).0 may execute on different ma-
chines; the network between these machines may not be physically secure. There-
fore, we would like to explicate a channel like n in lower-level terms, mapping it
(νn)(n⟨m⟩ | n(x).x⟨⟩)
which is a small variant of the first example where, after its receipt, the message
m is used for sending an empty message. This process preserves the integrity
of m, in the sense that no other name can be received and used instead of m;
therefore, this process is equivalent to m⟨⟩.
The obvious spi calculus implementations of (νn)(n⟨m⟩ | n(x).x⟨⟩) and m⟨⟩
are respectively
(νn)( c⟨{m1}n⟩ | c⟨{m2}n⟩ |
      c(y).case y of {x}n in c⟨{}x⟩ |
      c(y).case y of {x}n in c⟨{}x⟩ )
( c(z1).c⟨{m1, z1}n⟩ | c(z2).c⟨{m2, z2}n⟩ |
protocol is rather simplistic in that the challenges may get "crossed" and then
neither m1 nor m2 would be transmitted successfully; it is a simple matter of
programming to protect against this confusion. In any case, for each challenge,
at most one message is accepted under n. This use of challenges thwarts replay
attacks.
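The challenge discipline sketched above can be illustrated in Java (our own stand-in, with assumed names; not the chapter's code): the receiver issues fresh challenges and accepts each one at most once, so a replayed message fails because its challenge has already been consumed.

```java
import java.security.SecureRandom;
import java.util.HashSet;
import java.util.Set;

// Sketch of challenge-based replay protection: accept() succeeds at most
// once per issued challenge, so a replayed message is rejected.
class ChallengeGuard {
    private final Set<Long> outstanding = new HashSet<>();
    private final SecureRandom rng = new SecureRandom();

    long issueChallenge() {
        long z = rng.nextLong();   // a fresh challenge, like z1, z2 above
        outstanding.add(z);
        return z;
    }

    // Returns true only if the challenge is outstanding; consumes it.
    boolean accept(long challenge) {
        return outstanding.remove(challenge);
    }
}
```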
will not even discover whether m1 = m2. (For this example, we drop the implicit
assumption that m1 and m2 are different names.) On the other hand, suppose
that we translate this process to:
(νn)( c⟨{m1}n⟩ | c⟨{m2}n⟩ |
      c(y).case y of {x}n in 0 |
      c(y).case y of {x}n in 0 )
"^
This process reads and relays two messages on the channel c, and emits a message
on the channel d if the two messages are equal. It therefore distinguishes whether
m1 = m2. The importance of this sort of leak depends on circumstances. In an
extreme case, one cleartext may have been guessed (for example, the cleartext
"attack at dawn"); knowing that another message contains the same cleartext
may then be significant.
A simple countermeasure consists in including a different confounder com-
ponent in each encrypted message. In this example, the implementation would
become:
(νn)( (νp1)c⟨{m1, p1}n⟩ | (νp2)c⟨{m2, p2}n⟩ |
      c(y).case y of {x, z1}n in 0 |
      c(y).case y of {x, z2}n in 0 )
The names p1 and p2 are used only to differentiate the two messages being
transmitted. Their inclusion in those messages ensures that a comparison on
ciphertexts does not reveal an equality of cleartexts.
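A minimal Java sketch of this countermeasure (our illustration, using the standard JCA Cipher API; class and method names are assumed): a deterministic encryption of equal cleartexts yields equal ciphertexts, while encrypting a fresh confounder together with the message, as in {m, p}n above, makes the ciphertexts differ.

```java
import java.security.SecureRandom;
import javax.crypto.Cipher;
import javax.crypto.SecretKey;

// Sketch of the confounder countermeasure against ciphertext comparison.
class ConfounderDemo {
    // Deterministic encryption: equal cleartexts yield equal ciphertexts,
    // so an observer comparing ciphertexts learns cleartext equality.
    static byte[] deterministic(SecretKey k, byte[] m) throws Exception {
        Cipher c = Cipher.getInstance("AES/ECB/PKCS5Padding");
        c.init(Cipher.ENCRYPT_MODE, k);
        return c.doFinal(m);
    }

    // A fresh confounder p is encrypted together with the message, as in
    // {m, p}n; two encryptions of the same cleartext now differ.
    static byte[] withConfounder(SecretKey k, byte[] m) throws Exception {
        byte[] p = new byte[16];
        new SecureRandom().nextBytes(p);     // the confounder (p1, p2 above)
        byte[] tuple = new byte[p.length + m.length];
        System.arraycopy(p, 0, tuple, 0, p.length);
        System.arraycopy(m, 0, tuple, p.length, m.length);
        Cipher c = Cipher.getInstance("AES/ECB/PKCS5Padding");
        c.init(Cipher.ENCRYPT_MODE, k);
        return c.doFinal(tuple);
    }
}
```

Comparing two outputs of deterministic() for the same message reveals equality; two outputs of withConfounder() do not.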
may use n afterwards, but cannot recover the contents of the first message sent
on n. Therefore, this process is equivalent to
(νn)(n⟨m'⟩ | n(x).p⟨n⟩)
for any m'. Interestingly, this example relies crucially on scope extrusion, a
feature of the pi calculus not present in simpler calculi such as CCS [31].
A spi calculus implementation of (νn)(n⟨m⟩ | n(x).p⟨n⟩) might be:
(νn)( c⟨{m}n⟩ | c(y).case y of {x}n in c⟨{n}p⟩ )
However, this implementation lacks the forward-secrecy property [14]: the disclo-
sure of the key n compromises all data previously sent under n. More precisely,
a process may read messages on c and remember them, obtain n by decrypt-
ing {n}p, then use n for decrypting older messages on c. In particular, the spi
calculus process
c(x).( c⟨x⟩ | c(y).case y of {z}p in case x of {w}z in d⟨w⟩ )
may read and relay {m}n, read and decrypt {n}p, then go back to obtain m
from {m}n, and finally release m on the public channel d.
Full abstraction is lost, as with the other attacks; in this case, however, it
seems much harder to recover. Several solutions may be considered.
- We may restrict the pi calculus somehow, ruling out troublesome cases of
scope extrusion. It is not immediately clear whether enough expressiveness
for practical programming can be retained.
- We may add some constructs to the pi calculus, for example a construct that
given the name n of a channel will yield all previous messages sent on the
channel n. The addition of this construct will destroy the source-language
equivalence that was not preserved by the translation. On the other hand,
this construct seems fairly artificial.
- We may somehow indicate that source-language equivalences should not be
taken too seriously. In particular, we may reveal some aspects of the imple-
mentation, warning that forward secrecy may not hold. We may also spec-
ify which source-language properties are maintained in the implementation.
This solution is perhaps the most realistic one, although we do not yet know
how to write the necessary specifications in a precise and manageable form.
- Finally, we may try to strengthen the implementation. For example, we may
vary the key that corresponds to a pi calculus channel by, at each instant,
computing a new key by hashing the previous one. This approach is fairly
elaborate and expensive.
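The key-evolution idea in the last option can be sketched in Java (our illustration, with an assumed class name): the key for one instant is obtained by hashing the previous one, so that, because the hash is one-way, disclosure of the current key does not let an attacker recompute earlier keys.

```java
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Sketch of key evolution by hashing: key(i+1) = H(key(i)). One-wayness of
// SHA-256 means a party who learns the current key cannot recover earlier
// keys, so messages protected under them keep their secrecy.
class KeyRatchet {
    static byte[] next(byte[] key) throws NoSuchAlgorithmException {
        return MessageDigest.getInstance("SHA-256").digest(key);
    }
}
```

For example, after k1 = next(k0) and k2 = next(k1), revealing k2 gives no feasible way back to k1 or k0.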
The problem of forward secrecy may be neatly avoided by shifting from the pi
calculus to the join calculus [15]. The join calculus separates the capabilities for
sending and receiving on a channel, and forbids the communication of the latter
capability. Because of this asymmetry, the join calculus is somewhat easier to
map to a lower-level calculus with cryptographic constructs. This mapping is the
subject of current work [2]; although still impractical, the translation obtained
is fully abstract.
With progress on security infrastructures and techniques, it may become less im-
portant for translations to approximate full abstraction. Instead, we may rely on
the intrinsic security properties of target-language code and on digital signatures
on this code. We may also rely on the security properties of source-language
code, but only when a precise specification asserts that translation preserves
those properties. Unfortunately, several caveats apply.
Acknowledgements
Most of the observations of this paper were made during joint work with Cedric
Fournet, Georges Gonthier, Andy Gordon, and Raymie Stata. Drew Dean, Mark
Lillibridge, and Dan Wallach helped by explaining various Java subtleties. Mike
Burrows, Cedric Fournet, Mark Lillibridge, John Mitchell, and Dan Wallach
suggested improvements to a draft. The title is derived from that of a paper by
Jim Morris [36].
References
20. Anita K. Jones and Barbara H. Liskov. A language extension for expressing con-
straints on data access. Communications of the ACM, 21(5):358-367, May 1978.
21. Leslie Lamport. A simple approach to specifying concurrent systems. Communi-
cations of the ACM, 32(1):32-45, January 1989.
22. Butler W. Lampson. Protection. In Proceedings of the 5th Princeton Conference
on Information Sciences and Systems, pages 437-443, 1971.
23. Butler W. Lampson. Hints for computer system design. Operating Systems Re-
view, 17(5):33-48, October 1983. Proceedings of the Ninth ACM Symposium on
Operating System Principles.
24. Butler W. Lampson and Howard E. Sturgis. Reflections on an operating system
design. Communications of the ACM, 19(5):251-265, May 1976.
25. Xavier Leroy and François Rouaix. Security properties of typed applets. In Pro-
ceedings of the 25th ACM Symposium on Principles of Programming Languages,
pages 391-403, 1998.
26. Tim Lindholm and Frank Yellin. The Java Virtual Machine Specification. Addison-
Wesley, 1996.
27. John Longley and Gordon Plotkin. Logical full abstraction and PCF. In Jonathan
Ginzburg, Zurab Khasidashvili, Carl Vogel, Jean-Jacques Levy, and Enric Vallduvi,
editors, The Tbilisi Symposium on Logic, Language and Computation: Selected
Papers, pages 333-352. CSLI Publications and FoLLI, 1998.
28. John McLean. A general theory of composition for a class of "possibilistic" prop-
erties. IEEE Transactions on Software Engineering, 22(l):53-66, January 1996.
29. Alfred J. Menezes, Paul C. van Oorschot, and Scott A. Vanstone. Handbook of
Applied Cryptography. CRC Press, 1996.
30. Robin Milner. Fully abstract models of typed λ-calculi. Theoretical Computer
Science, 4:1-22, 1977.
31. Robin Milner. Communication and Concurrency. Prentice-Hall International,
1989.
32. Robin Milner. Functions as processes. Mathematical Structures in Computer Sci-
ence, 2:119-141, 1992.
33. Robin Milner. The polyadic π-calculus: a tutorial. In Bauer, Brauer, and Schwicht-
enberg, editors, Logic and Algebra of Specification. Springer-Verlag, 1993.
34. Robin Milner, Joachim Parrow, and David Walker. A calculus of mobile processes,
parts I and II. Information and Computation, 100:1-40 and 41-77, September
1992.
35. John C. Mitchell. On abstraction and the expressive power of programming lan-
guages. Science of Computer Programming, 21(2):141-163, October 1993.
36. James H. Morris, Jr. Protection in programming languages. Communications of
the ACM, 16(1):15-21, January 1973.
37. Greg Morrisett, David Walker, Karl Crary, and Neal Glew. From System F to
Typed Assembly Language. In Proceedings of the 25th ACM Symposium on Prin-
ciples of Programming Languages, pages 85-97, 1998.
38. Andrew C. Myers and Barbara Liskov. A decentralized model for information
flow control. In Proceedings of the 16th ACM Symposium on Operating System
Principles, pages 129-142, 1997.
39. George C. Necula and Peter Lee. The design and implementation of a certifying
compiler. In Proceedings of the ACM SIGPLAN'98 Conference on Programming
Language Design and Implementation (PLDI), pages 333-344, 1998.
40. Gordon Plotkin. LCF considered as a programming language. Theoretical Com-
puter Science, 5:223-256, 1977.
41. Zhenyu Qian. A formal specification of Java Virtual Machine instructions for
objects, methods and subroutines. In Jim Alves-Foss, editor, Formal Syntax and
Semantics of Java™. Springer-Verlag, 1998. To appear.
42. Jon G. Riecke. Fully abstract translations between functional languages. Mathe-
matical Structures in Computer Science, 3(4):387-415, December 1993.
43. Ehud Shapiro. Separating concurrent languages with categories of language em-
beddings. In Proceedings of the Twenty Third Annual ACM Symposium on the
Theory of Computing, pages 198-208, 1991.
44. Raymie Stata and Martin Abadi. A type system for Java bytecode subroutines. In
Proceedings of the 25th ACM Symposium on Principles of Programming Languages,
pages 149-160, January 1998.
45. Sun Microsystems, Inc. Inner classes specification. Web pages at
http://java.sun.com/products/jdk/1.1/docs/guide/innerclasses/, 1997.
46. Sun Microsystems, Inc. RMI enhancements. Web pages at
http://java.sun.com/products/jdk/1.2/docs/guide/rmi/index.html, 1997.
47. Leendert van Doorn, Martin Abadi, Mike Burrows, and Edward Wobber. Secure
network objects. In Proceedings 1996 IEEE Symposium on Security and Privacy,
pages 211-221, May 1996.
48. Dennis Volpano, Cynthia Irvine, and Geoffrey Smith. A sound type system for
secure flow analysis. Journal of Computer Security, 4:167-187, 1996.
49. Edward Wobber, Martin Abadi, Michael Burrows, and Butler Lampson. Authen-
tication in the Taos operating system. ACM Transactions on Computer Systems,
12(l):3-32, February 1994.
50. Ann Wollrath, Roger Riggs, and Jim Waldo. A distributed object model for the
Java system. Computing Systems, 9(4):265-290, Fall 1996.
Reflective Authorization Systems:
Possibilities, Benefits, and Drawbacks
1 Introduction
Security implies not only protection from external intrusions but also control-
ling the actions of internally-executing entities and the operations of the whole
software system. In this case, the interleaving between operations and data pro-
tection may become very complicated and often intractable. For this reason, se-
curity must be specified and designed in a system from its early design steps [11].
From another point of view
— it is very important that the security mechanisms of the application be cor-
rect and stable;
— the security code should not be mixed with the application code, otherwise
it will be very hard to reuse well-proven implementations of the security
model.
If this is not done, when a new secure application is developed the designer/
implementer wastes time re-implementing and testing the security modules of the
application. Moreover, security is related to: "who is allowed to do what, where
and when"; so security is not functionally part of the solution of the application
problem, but an added feature defining constraints on object interactions. From
this last remark we can think of security as a feature operating at a different
computational level and we can separate its implementation from the applica-
tion implementation.
In our opinion it is possible to exploit some typical reflection features, like sep-
aration of concerns and transparency, to split a secure system into two levels:
at the first level there are (distributed) objects cooperating to solve the sys-
tem application; at the second one, rights and authorizations for such entities
are identified, specified and mapped onto reflective entities which transparently
monitor the objects of the first level and authorize the allowed access to other
objects, services, or information.
Working in this way it is possible to develop stable and reliable entities for han-
dling security. It is also possible to reuse them during system development, thus
reducing development time and costs, and increasing application level assurance.
In most systems, authorization is defined with respect to persistent data and en-
forced by the DBMS and/or operating system. Object-oriented systems define
everything as an object, some persistent some temporary, where this separation
is not visible at the application level. In these systems authorization must be
defined at the application level to take advantage of the semantic restrictions
of the information [10]. An early system (not object-oriented; see [11], page
195) attempted this kind of control by defining programs that had predefined
and preauthorized accesses. Reflection appears as a good possibility for this type
of control because it does not separate persistent from temporary entities. The
Birlix operating system [18] used reflection to adapt its nonfunctional proper-
ties (including security) to different execution and application environments (for
more details on how to satisfy nonfunctional requirements using reflection, see [21]).
In this paper we examine how to use a reflective architecture, such as those de-
scribed above, to manage the authorization aspects of an application and the
advantages and drawbacks of using such an approach.
2.1 Reflection
flexibility and modularity in the software system at the cost of meta-entity pro-
liferation. The reflective models considered here are meta-object and channel
reification.
Channel Reification Model. In the channel reification model [1, 2], one or
more objects, called channels, are established between two interacting objects.
Each channel is characterized by a triple composed of the objects it connects
and by the kind of the meta-computation it performs.
A channel kind identifies the meta-behavior provided by the channel. The kind
is used to distinguish the reflective activity to be performed: several channels
(distinguishable by the kind) can be established between the same pair of objects
at the same moment.
Each channel persists after each meta-computation, and is reused when a com-
munication characterized by the same triple is generated. The features of the
model are: method-level granularity, information continuity, channel lazy cre-
ation, and global view. Each service request of a specific kind is trapped (shift-
up action) by the channel of the corresponding kind connecting client and server
objects, if it exists. Otherwise, such a channel is created; in either case, it then
performs its meta-computation and transmits the service request to the supplier.
The server's answer is collected and returned to the requesting object (shift-down
action).
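The lazy creation and reuse of channels can be sketched in Java (our own stand-in, with assumed names; not the authors' code): channels are cached under the triple identifying them, created on first use, and reused for later requests with the same triple.

```java
import java.util.HashMap;
import java.util.Map;

// Sketch of channel lazy creation and reuse: channels are keyed by the
// triple (client, server, kind); a trapped request finds or creates the
// matching channel, which performs its meta-computation and forwards.
class ChannelTable {
    private final Map<String, Channel> channels = new HashMap<>();

    Object request(String client, String server, String kind, String method) {
        String triple = client + "/" + server + "/" + kind;
        // lazy creation: the channel is built only on first use of this triple
        Channel ch = channels.computeIfAbsent(triple, t -> new Channel(kind));
        return ch.forward(method);   // shift-up, meta-computation, shift-down
    }

    int channelCount() { return channels.size(); }
}

class Channel {
    final String kind;
    int served = 0;   // information continuity: state persists across requests
    Channel(String kind) { this.kind = kind; }
    Object forward(String method) {
        served++;      // a real channel would run its validation here
        return method + " via " + kind + " channel";
    }
}
```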
the matrix contains the permitted access type. Such a model can be realized by
capability lists, access control lists, or combinations of these. Formally, an autho-
rization right R for a subject s to access an object o, through message (method)
m can be written R(s)=(m,o). Here m usually represents a high-level access type,
e.g., hire an employee.
A Role-Based Access Control (RBAC) model is particularly suitable for object-
oriented systems [17], because accesses can be defined as operations defined in
the objects according to the need-to-know of the roles [9]. Because reflection
supports a fine granularity of access, its combination with RBAC can be quite
effective.
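A minimal RBAC check in the spirit of the right R(s) = (m, o) can be sketched as follows (our illustration; all names are assumed): a role is granted a set of high-level accesses, a subject holds roles, and a request is allowed only if some role of the subject grants the access.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Sketch of role-based authorization: R(s) = (m, o) holds when some role
// assigned to subject s grants method m on object class o.
class Rbac {
    private final Map<String, Set<String>> roleGrants = new HashMap<>();
    private final Map<String, Set<String>> subjectRoles = new HashMap<>();

    void grant(String role, String method, String objectClass) {
        roleGrants.computeIfAbsent(role, r -> new HashSet<>())
                  .add(method + "@" + objectClass);
    }

    void assign(String subject, String role) {
        subjectRoles.computeIfAbsent(subject, s -> new HashSet<>()).add(role);
    }

    boolean allowed(String subject, String method, String objectClass) {
        for (String role : subjectRoles.getOrDefault(subject, Set.of()))
            if (roleGrants.getOrDefault(role, Set.of())
                          .contains(method + "@" + objectClass))
                return true;
        return false;
    }
}
```

Here m is a high-level access such as "hire an employee", matching the example above.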
We illustrate our ideas by using the following scenario: the system is com-
posed of several objects interacting in a client-server manner. For security rea-
sons, the services supplied by a server and its data are protected, and access to
them is prohibited for some subjects. To support our presentation we use some
stubs of Java code. The code is tailored to the case of three clients and one
server supplying two methods, one reading the server state and the other modify-
ing it. Each client has different rights on the two supplied services. Java [3]
is not a reflective language; for our purposes we realize reflection in a
naive way by emulating it with inheritance. Of course in this way transparency
and efficiency are compromised. However, we chose this approach for simplic-
ity and because it permits us to point out the advantages and the weak points of
the reflective approach. In particular, it shows the necessity to implement the
context switch in a secure way. The complete code can be downloaded from
http://www.disi.unige.it/person/CazzolaW/OORSecurity.html.
Fig. 1. The meta-object model: the meta-object mo, at the meta-level, encapsulates
the authorizations of the base-level object o accessed by subject s.
In the meta-object model a meta-object is associated with each object (as shown
in Fig. 1); this meta-object encapsulates the related access control list of the
Fig. 2. The shell wraps the server kernel; requests are trapped by the shell and
validated by the meta-object.
When a client issues the call:
permission = server.methodRead();
a request for the execution of methodRead is trapped by the shell (step 1 in
Fig. 2), which calls the meta-object to obtain the authorization (step 2 in
Fig. 2). The meta-object replies to the shell (step 3 in Fig. 2) and, if the request
is authorized, allows the kernel to return the reply. Otherwise, the shell notifies
the client that a violation occurred by sending it an error message (step 4 in
Fig. 2).
public boolean methodRead(int IDClient) {
  boolean access = rem_obj2.metaBehavior(METHOD_READ, IDClient);
  if (access) return super.methodRead(1);
  else return false;
}

public boolean methodWrite(int IDClient) {
  boolean access = rem_obj2.metaBehavior(METHOD_WRITE, IDClient);
  if (access) return super.methodWrite(1);
  else return false;
}

public static void main(String args[]) {
  System.setSecurityManager(new RMISecurityManager()); // RMI registration
}
This model is suitable for implementing highly specialized role rights (specialized
per object, per subject or per service request). Moreover, each meta-object may
hold the methods necessary to modify authorization rules of the base-objects [12],
which can be called for modifying active authorizations. In this way, the devel-
oper of base-objects does not need to worry about validity of authorization
updates, which are encapsulated into the meta-object behavior. Obviously our
example suffers efficiency penalties due to the reflective implementation in
Java.
Fig. 3. The channel reification model: channels such as ChannelWrite are interposed
between subject s and object o.
the remote method invocation in order to divert the request from the server to
the channel. The client is composed of three layers (see Fig. 4). The inner layer
supplies the operations necessary to communicate with the server. The central
layer encapsulates all operations of the inner layer and uses them to communi-
cate with the right channel. The outer layer defines the behavior of the client.
The server code corresponds exactly to the kernel code of the previous approach.
There is no need to wrap it because the server receives requests that have been
previously filtered by the channels; it cannot be accessed directly by the clients.
The above splitting of responsibility of the validation mechanism makes the
code of a channel simpler and more modular than that of the meta-object. Op-
erations channelRead and channelWrite are very similar; they differ only in the
encapsulated information and the use they make of it. In our example channels
and meta-objects are similar because the validation phase is simple; however, it
should be noted that the amount of information encapsulated and managed by
a single channel has decreased.
When a client issues the call (see Fig. 4):
permission = server.methodRead();
Fig. 4. The client's three layers (inner, central, outer) and the channels mediating
requests to the server.
the control is dispatched to the central layer of the client (step 1 in Fig. 4), which
determines the right channel to be invoked on the basis of the request kind. The
request is then forwarded to the channel while the client idles until a reply is
sent back to it (step 2 in Fig. 4). The channel validates the request and, if legal,
forwards it to the server (step 3 in Fig. 4). When the channel receives the reply
from the server, it forwards it back to the client (step 4 in Fig. 4). The server
is unaware of the validation process: it executes only filtered requests. Thus a
good separation of concerns and implementation modularity is achieved.
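The flow of Fig. 4 can be condensed into the following Java sketch (ours, with assumed interfaces and a stand-in validation rule; not the authors' code): the central layer selects a channel by request kind, and the channel validates before forwarding, so the server only ever sees filtered requests.

```java
// Sketch of the dispatch in Fig. 4: central layer -> channel -> server.
interface KernelServer {
    boolean methodRead(int clientId);
}

class ReadChannel {
    private final KernelServer server;
    ReadChannel(KernelServer server) { this.server = server; }

    boolean handle(int clientId) {
        boolean legal = clientId == 1;            // stand-in validation rule
        return legal && server.methodRead(clientId); // forward only if legal
    }
}

class CentralLayer {
    private final ReadChannel readChannel;
    CentralLayer(ReadChannel c) { this.readChannel = c; }

    boolean dispatch(String kind, int clientId) {
        if (kind.equals("read")) return readChannel.handle(clientId);
        return false;                              // unknown kind: refused
    }
}
```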
Security models based on communication flow [5] can be easily modeled by the
channel reification model: in particular, models based on the concept of flow
such as those of Escort of Scout [19] and Corps [16]. A path is a first class object
encapsulating data flowing through a set of modules. A path is a logical channel
made up of data flow connecting several modules. Two other security mecha-
nisms adopted in Escort are filters and protection domains. A filter restricts the
interface between two adjacent modules. However, filters include no mechanism
to ensure that a module does not bypass the interface by directly accessing the
memory of the other module. A protection domain is a boundary drawn between
a pair of modules to ensure that the mutual access is performed only through the
defined interface. Filters and protection domains may be modeled respectively
by standard reflective channels and protected channels, i.e., channels executing
in a different address space. The concept of path is more complex and requires
an extension of the channel reification mechanism. A channel controlling a path
may be obtained by piping or composing the channels controlling the sequence
of modules forming a path or by defining a complex channel successively control-
ling the interface of a sequence of modules in a path. Another approach could be
based on a three-level reflective tower, but this approach increases the system
complexity without significant advantages.
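The "piping" option can be sketched in Java (our illustration, with assumed names): a path channel is the composition of the channels controlling each module interface, and a request is admitted only if every stage's check passes.

```java
import java.util.List;

// Sketch of composing channels into a path channel: a request traverses
// the pipe of stages and is admitted only if every stage accepts it.
interface Stage {
    boolean check(String request);
}

class PathChannel {
    private final List<Stage> stages;
    PathChannel(List<Stage> stages) { this.stages = stages; }

    boolean admit(String request) {
        for (Stage s : stages)
            if (!s.check(request)) return false;   // any stage can refuse
        return true;
    }
}
```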
4 Evaluation
}
if (cell == 34) {  // 34 is the int code for "R"
  System.out.println("\nYou have the READ permission !");
  return rem_objS.Method_Read();
} else {
  System.out.println("\nYou don't have the READ permission ! !");
  return Boolean.FALSE;
}
}

public static void main(String args[]) {
  channelRead rem_obj;
  // creation and installation of the security manager
  System.setSecurityManager(new RMISecurityManager());
den to the application entities; thus the code of such entities is simplified. While
this can be accomplished by current DBMS authorization systems, we are now
controlling access to all executing entities, not just the application's persistent
data.
The advantages from the security point of view are that only the entities per-
forming the validation of the authorizations of an object know its authorization
constraints. In this way authorization leaking is minimized.
A first drawback is that flexibility has a cost in terms of efficiency due to repeated
method activation. The Achilles' heel of using reflection to realize security could
be its implementation mechanism. In fact, as stressed by our examples, each
reflective approach has a mechanism that forwards each request to the
meta-entities. These trap actions (shift-up and shift-down) are critical. It
is possible for a malicious user to intercept the trap and hijack the request to
another entity, bypassing the authorization system and illegally authorizing each
request. For this reason it is important to protect the meta-level and the access
to it; for the same reason it is important to limit the meta-entity commu-
nication and, thus, to use a reflective model having the global-view feature. One
way to avoid this attack consists of implementing the whole meta-computation,
and in particular the trap actions, as protected routines. From another point of
view this drawback can be converted into an advantage of the model with respect
to standard authorization models, because it minimizes the vulnerable points to
known system locations that can be more efficiently protected. The possibility
of using different address spaces (e.g., different processes) to implement the two
reflective layers with kernel system intervention for exchanging information rep-
resents an improvement to the security of the complete system.
Some recent proposals to control actions of Java applets have similar purposes
to our proposal. In those systems, e.g., in [13], execution domains are created
and enforced for specific applications using downloaded content. However, the
objective of these approaches is to control access to operating system resources,
e.g., files, memory spaces, ..., not to control high-level actions between objects.
5 Conclusions
Acknowledgments
A preliminary version of this work appears in the proceedings of the 1st ECOOP
Workshop on Distributed Object Security, pages 35-39, Belgium, July 1998.
References
[1] Massimo Ancona, Walter Cazzola, Gabriella Dodero, and Vittoria Gi-
anuzzi. Channel Reification: a Reflective Approach to Fault-Tolerant
Software Development. In OOPSLA '95 (poster section), page 137,
Austin, Texas, USA, on 15th-19th October 1995. ACM. Available at
http://www.disi.unige.it/person/CazzolaW/references.html.
[2] Massimo Ancona, Walter Cazzola, Gabriella Dodero, and Vittoria Gianuzzi.
Channel Reification: A Reflective Model for Distributed Computation. In proceed-
ings of IEEE International Performance Computing, and Communication Confer-
ence (IPCCC'98), 98CH36191, pages 32-36, Phoenix, Arizona, USA, on 16th-18th
February 1998. IEEE.
[3] Ken Arnold and James Gosling. The Java Programming Language. The Java Se-
ries ... from the Source. Addison-Wesley, Reading, Massachussetts, second edition,
December 1997.
[4] Elisa Bertino, Sabrina De Capitani di Vimercati, Elena Ferrari, and Pierangela
Samarati. Exception-Based Information Flow Control in Object-Oriented Sys-
tems. ACM Transactions on Information and System Security (TISSEC), 1(1),
November 1998.
[5] W. E. Boebert and R. Y. Kain. A Practical Alternative to Hierarchical Integrity
Policies. In proceedings of 8th National Computing Security Conference, Gaithers-
burg, October 1985.
[6] Walter Cazzola. Evaluation of Object-Oriented Reflective Models. In proceed-
ings of ECOOP Workshop on Reflective Object-Oriented Programming and Sys-
tems (EWROOPS'98), in 12th European Conference on Object-Oriented Pro-
gramming (ECOOP'98), Brussels, Belgium, on 20th-24th July 1998. Available at
http://www.disi.unige.it/person/CazzolaW/references.html.
[7] Helen Custer. Inside Windows NT. Microsoft Press, Redmond, WA, 1993.
[8] François-Nicola Demers and Jacques Malenfant. Reflection in Logic, Functional
and Object-Oriented Programming: a Short Comparative Study. In proceed-
ings of the workshop section, in IJCAI'95 (International Joint Conference on AI),
Montreal, Canada, August 1995.
[9] Eduardo B. Fernandez and J. C. Hawkins. Determining Role Rights from Use
Cases. In proceedings of the 2nd ACM Workshop on Role Based Access Control
(RBAC'97), pages 121-125, November 1997.
1 Introduction
The Internet and the World-Wide-Web provide a computational infrastructure that
spans the planet. It is appealing to imagine writing programs that exploit this global in-
frastructure. Unfortunately, the Web violates many familiar assumptions about the be-
havior of distributed systems, and demands novel and specialized programming
techniques. In particular, three phenomena that remain largely hidden in local-area net-
work architectures become readily observable on the Web:
• (A) Virtual locations. Barriers are erected between mutually distrustful admin-
istrative domains. Therefore, a program must be aware of where it is, and of how
to move or communicate between different domains. The existence of separate
administrative domains induces a notion of virtual locations and of virtual dis-
tance between locations.
• (B) Physical locations. On a planet-size structure, the speed of light becomes
tangible. For example, a procedure call to the antipodes requires at least 1/10 of
a second, independently of future improvements in networking technology. This
absolute lower bound to latency induces a notion of physical locations and phys-
ical distance between locations.
• (C) Bandwidth fluctuations. A global network is susceptible to unpredictable
congestion and partitioning, which results in fluctuations or temporary interrup-
tions of bandwidth. Moreover, mobile devices may perceive bandwidth changes
as a consequence of physical movement. Programs need to be able to observe
and react to these fluctuations.
These features may interact among themselves. For example, bandwidth fluctuations
may be related to physical location because of different patterns of day and night net-
work utilization, and to virtual location because of authentication and encryption across
domain boundaries. Virtual and physical locations are often related, but need not coin-
cide.
In addition, another phenomenon becomes unobservable on the Web:
• (D) Failures. On the Web, there is no practical upper bound to communication
delays. In particular, failures become indistinguishable from long delays, and
thus undetectable. Failure recovery becomes indistinguishable from intermittent
connectivity. Furthermore, delays (and, implicitly, failures) are frequent and un-
predictable.
These four phenomena determine the set of observables of the Web: the events or
states that can be in principle detected. Observables, in turn, influence the basic building
blocks of computation. In moving from local-area networks to wide-area networks, the
set of observables changes, and so does the computational model, the programming
constructs, and the kind of programs one can write. The question of how to "program
the Web" reduces to the question of how to program with the new set of observables
provided by the Web.
At least one general technique has emerged to cope with the observables character-
istic of a wide-area network such as the Web. Mobile computation is the notion that run-
ning programs need not be forever tied to a single network node. Mobile computation
can deal in original ways with the following phenomena:
• (A) Virtual locations. Given adequate trust mechanisms, mobile computations
can cross barriers and move between virtual locations. Barriers are designed to
impede access, but when code is allowed to cross them, it can access local re-
sources without the impediment of the barrier.
• (B) Physical locations. Mobile computations can move between physical loca-
tions, turning remote calls into local calls, and thus avoiding the latency limit.
• (C) Bandwidth fluctuations. Mobile computations can react to bandwidth fluc-
tuations, either by moving to a better-placed location, or by transferring code
that establishes a customized protocol over a connection.
• (D) Failures. Mobile computations can run away from anticipated failures, and
can move around presumed failures.
Mobile computation is also strongly related to recent hardware advances, since
computations move implicitly when carried on portable devices. In this sense, we can-
not avoid the issues raised by mobile computation: more than an avant-garde software
technique, it is an existing hardware reality.
In this paper, we discuss mobile computation at an entirely informal level; formal
accounts of our framework can be found in [13]. In Section 2 we describe the basic char-
acteristics of our existing computational infrastructure, and the difficulties that must be
overcome to use it effectively. In Section 3 we review existing ways of modeling dis-
tribution and mobility. In Section 4 we introduce an abstract model, the ambient calcu-
lus, that attempts to capture fundamental features of distribution and mobility in a
simple framework. In Section 5, we discuss applications of this model to programming
issues, including a detailed example and a programming challenge.
logical step apart. Moreover, computers can be considered immobile; for example, they
usually preserve their network address when physically moved.
Even in this relatively static environment, the notion of mobility has gradually ac-
quired prominence, in a variety of forms. For example:
• Control mobility. During an RPC (Remote Procedure Call) or RMI (Remote
Method Invocation) call, a thread of control is thought of as moving from one
machine to another and back.
• Data mobility. In RPC/RMI, data is linearized, transported, and reconstructed
across machines.
• Link mobility. The end-points of network channels, or remote object proxies,
can be transmitted.
• Object mobility. For load balancing purposes, objects can be moved between
different servers.
• Remote Execution. Computations can be shipped for execution on a server.
(This is an early version of code mobility, proposed as an extension of RPC.
[35])
In recent years, distributed computing has been endowed with greater mobility
properties and easier network programming. Techniques such as Object Request Bro-
kers have emerged to abstract over the location of objects providing certain services.
Code mobility has emerged in Tcl and other scripting languages to control network ap-
plications. Agent mobility has been pioneered in Telescript [37], aimed towards a uni-
form (although wide-area) network of services. Closure mobility (the mobility of active
and connected entities) has been investigated in Obliq [11].
In due time, local-area-network techniques would have smoothly and gradually
evolved towards deployment on wide-area networks, e.g. as was explicitly attempted by
the CORBA effort. But, suddenly, a particular wide-area network came along that rad-
ically changed the fundamental assumptions of distributed computing and its pace of
progress: the Web.
More distressing is the fact that the Web does not behave like a LAN either. Many
proposals have emerged along the lines of extending LAN concepts to a global environ-
ment; that is, in turning the Internet into a distributed address space, or a distributed file
system. However, since the global environment does not have the stability properties of
a LAN, this can be achieved only by introducing redundancy (for reliability), replica-
tion (for quality of service), and scalability (for management) at many different levels.
Things might have evolved in this direction, but this is not the way the Web came to be.
The Web is, almost by definition, unreliable, unpredictable, and unmanageable as a
whole, and was not designed with LAN-like guarantees of service.
Still, a single faulty routing configuration file spread over the Internet in July 1997,
causing the disappearance of a large number of Internet domains. In this case, the
vulnerable "brain" was the collection of Internet routers.
ample, mobile Java applets provided the first disciplined mechanism for running code
able to (and allowed to) systematically penetrate other people's firewalls. Countless
projects have emerged in the last few years with the aim of supporting mobile compu-
tation over wide areas, and are beginning to be consolidated.
At this point, our architectural goal might be to devise techniques for managing
computation over an unreliable collection of far-flung computers. However, this is not
yet the full picture. Not only are network links and nodes widely dispersed and unreli-
able; they are not even liable to stay put, as we discuss next.
Mental Image 3 focuses on two domains: the United States and the European
Union, each enclosed by a political boundary that regulates the movement of people and
computers. Within a political boundary, private companies and public agencies may
further regulate the flow of people and devices across their doors. Over the Atlantic we
see a third domain, representing Air France flight 81 travelling from San Francisco to
Paris. AF81 is a very active mobile computational environment: it is full of people
working with their laptops and possibly connecting to the Internet through airphones.
(Not to mention the hundreds of computers that control the airplane and let it commu-
nicate with ground stations.)
Abstracting a bit from people and computation devices, we see here a hierarchy of
boundaries that enforce controls and require permissions for crossing. Passports are re-
quired to cross political boundaries, reservations are required for restaurants, tickets are
required for airplanes, and special clearances are required to enter (and exit!) agencies
such as the NSA. Sometimes, whole mobile boundaries cross in and out of other bound-
aries and similarly need permissions, as the mobile environment of AF81 needs permis-
sion to enter an airspace. On the other hand, once an entity has been allowed across a
boundary, it is fairly free to roam within the confines of the boundary, until another
boundary needs to be crossed.
tently connected. Even barring flaky networks, intermittent connectivity can be caused
by physical movement, for example when a wireless user moves into some form of
Faraday cage. More interestingly, intermittent connectivity may be caused by virtual
movement, for example when an agent moves in and out of an administrative domain
that does not allow communication. Neither case is really a failure of the infrastructure;
in both cases, lack of connectivity may in fact be a desirable security feature. Therefore,
we have to assume that intermittent connectivity, caused equivalently by physical or
virtual means, is an essential feature of mobility.
In the future we should be prepared to see increased interactions between virtual
and physical mobility, and we should develop frameworks where we can discuss and
manipulate these interactions.
idea that resources are available transparently at any time, no matter how far away. In-
stead, we have to get used to the notion that movement and communication are step-by-
step activities, and that they are visibly so: the multiple steps involved cannot be hidden,
collapsed, or rendered atomic.
The action-at-a-distance paradigm is still prevalent within LANs, and this is anoth-
er reason why LANs are different from WANs, where such an assumption cannot hold.
2.6 Why a WAN is not a big LAN
We have already discussed in the Introduction how a WAN exhibits a different set of
observables than a LAN. But could one emulate a LAN on top of a WAN, restoring a
more familiar set of observables, and therefore a more familiar set of programming
techniques? If this were possible, we could then go on and program the Internet just like
we now program a LAN.
To turn a WAN into a LAN we would have to hide the new observables that a WAN
introduces, and we would have to reveal the observables that a WAN hides. These tasks
range from difficult, to intolerable, to impossible. Referring to the classification in the
Introduction, we would have to achieve the following.
(A) Hiding virtual locations. We would have to devise a security infrastructure that
makes navigation across multiple administrative domains painless and transparent
(when legitimate). Although a great deal of cryptographic technology is available, there
may be impossibility results lurking in some corners. For example, it is far from clear
whether one can in principle guarantee the integrity of mobile computations against
hostile or unfair servers [33]. (This can be solved on a LAN by having all computers
under physical supervision.)
(B) Hiding physical locations. One cannot "hide" the speed of light; techniques
such as caching and replication may help, but they cannot fool processes that attempt to
perform long-distance real-time control and interaction. In principle, one could make
all delays uniform, so that programs would not behave differently in different places.
Ultimately this can be achieved only by slowing down the entire infrastructure, by em-
bedding the maximal propagation delay in all communications. (This would be about 1/
10 of a second on the surface, but would grow dramatically as the Web is extended to
satellite communication, orbital stations, and further away.)
(C) Hiding bandwidth fluctuations. It is possible to introduce service guarantees
in the networking infrastructure, and therefore eliminate bandwidth fluctuations, or re-
duce them below certain thresholds. However, in overload situations this has the only
effect of turning network congestion into access failures, which brings us to the next
point.
(D) Revealing failures. We would have to make failures as observable as on a
LAN. This is where we run into fundamental trouble. A basic result in distributed sys-
tems states that we cannot achieve distributed consensus (such as agreeing on which
nodes have failed) in a system consisting of a collection of asynchronous processes
[19]. The Web is such a system: we can make no assumption about the relative speed
of processors (they may be overloaded, or temporarily disconnected), about the speed
of communication (the network may be congested or partitioned), about the order of ar-
rival of messages, or even about the number of processes involved in a computation. In
3 Modeling Mobility
Section 2 was dedicated to showing that the reality of mobile computation over a WAN
does not fall into familiar categories. Therefore, we need to invent a new paradigm, or
model, that can help us in understanding and eventually in taking advantage of this re-
ality.
Since the Web is, after all, a distributed system, it is worth reviewing the existing
literature on models of distributed systems to see if there is something there that we can
already use. Readers who are not interested or experienced in models of concurrency,
may skip ahead to Section 4.
ity to communicate on that channel. It is perhaps best to think that a channel end-point
has been transmitted.
Let us consider a channel end-point that is transmitted across a domain boundary
over another channel that already crosses the boundary. If the transmitted end-point re-
mains functional, it provides a dynamically-established connection between the two
sides of the boundary. This is the kind of connection that firewalls typically forbid:
opening arbitrary network connections or allowing network-object proxy requests is not
allowed without further checks. The new channel that crosses the firewall could be seen
as an implicit firewall tunnel, but the establishment of trusted tunnels involves more
than simply passing a channel over another one, otherwise the firewall would lose all
control of security.
A firewall must watch the communication traffic over each channel that crosses it;
that is, it must act as an intermediary and forwarder of messages between the outside
and the inside of a domain. If a channel end-point is seen passing through, the firewall
must decide whether to allow communication on that channel, and if so it must create a
forwarder for it. So, a channel through a firewall must really be handled as two channels
connected by a filter [20].
Therefore, ability to communicate on a channel depends not only on possessing the
end-point of a channel, but also on where the other end-point of the channel is, and how
it got there. If the other end-point was sent through a firewall, then the ability to effec-
tively communicate on that channel depends on the attitude of the firewall.
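The two-channels-plus-filter view of a firewall can be illustrated with a small forwarder. The queue-based channels, the `firewall_filter` function, and the `allow` predicate below are my own illustrative stand-ins, not constructs from the paper:

```python
# Sketch of a firewall as a filter between two channel halves: messages on
# the outside channel reach the inside channel only if the policy admits them.
from queue import Queue

def firewall_filter(outside: Queue, inside: Queue, allow):
    """Forward pending messages from `outside` to `inside`, applying `allow`."""
    while not outside.empty():
        msg = outside.get()
        if allow(msg):
            inside.put(msg)   # forwarded across the boundary
        # else: dropped at the firewall

out_q, in_q = Queue(), Queue()
for m in ["hello", "attack", "data"]:
    out_q.put(m)
firewall_filter(out_q, in_q, allow=lambda m: m != "attack")
delivered = []
while not in_q.empty():
    delivered.append(in_q.get())
print(delivered)  # ['hello', 'data']
```

The point of the sketch is that possessing an end-point (a reference to `out_q`) does not by itself grant effective communication: delivery depends on the filter in between.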
Our approach: We provide a framework where processes exist in multiple disjoint
locations, and such that the location of a process influences its ability to communicate
with other processes. Dynamic connectivity is achieved by movement, but movement
does not guarantee continued connectivity.
Distribution
The π-calculus has no inherent notion of distribution. Processes exist in a single contig-
uous location, by default, because there is no built-in notion of distinct locations and
their effects on processes. Interaction between processes is achieved by global, shared
names (the channel names); therefore every process is assumed to be within earshot of
every other process. Incidentally, this makes distributed implementation of the original
π-calculus (and of CCS) quite hard, requiring some form of distributed consensus.
Various proposals have emerged to make the π-calculus suitable for distributed im-
plementation, and to extend it with location-awareness. The asynchronous π-calculus
[25] is obtained by a simple weakening of the π-calculus synchronization primitives.
Asynchronous messages simplify the requirements for distributed synchronization [32],
but they still do not localize the management of communication decisions. The join-cal-
culus [20] approaches this problem by rooting each channel at a particular process; this
provides a single place where synchronization is resolved. LLinda [18] is a formaliza-
tion of Linda [15] using process calculi techniques; as in distributed versions of Linda,
LLinda has multiple distributed tuple spaces, each with its local synchronization man-
ager.
Our approach: We restrict communication to happen within a single location, so
that communication can be locally managed. In particular, interaction is by shared lo-
cation, not by shared names. Remote communication is modeled by a combination of
mobility and local communication.
Locality
By locality we mean here distribution-awareness: a process has some notion of the lo-
cation it occupies, in an absolute or relative sense.
A growing body of literature is concentrating on the idea of adding discrete loca-
tions to a process calculus and considering failure of those locations [4, 21]. A notion
of locations alone is not sufficient, since locations could all really be in the "same
place". However, in presence of failures one could observe that certain locations have
failed and others have not, and deduce that those locations are truly in different places,
otherwise they would all have failed at the same time. The distributed join-calculus
[21], for example, adds a notion of named locations, and a notion of distributed failure;
locations form a tree, and subtrees can migrate from one part of the tree to another,
therefore becoming subject to different failure patterns.
This failure-based approach aims to model traditional distributed environments,
and traditional algorithms that tolerate node failures. However, on the Internet, node
failure is almost irrelevant compared with inability to reach nodes. Web servers do not
often fail forever, but they frequently disappear from sight because of network or node
overload, and then they come back. Sometimes they come back in a different place, for
example, when a Web site changes its Internet Service Provider. Moreover, inability to
reach a Web site only implies that a certain path is unavailable; it implies neither failure
of that site nor global unreachability. In this sense, a perceived node failure cannot sim-
ply be associated with the node itself, but instead is a property of the whole network, a
property that changes over time.
Our approach: The notion of locality is induced not by failures, but by the need to
cross barriers. Barriers produce a non-trivial and dynamic topology of locations. Loca-
tions are observably distinct because it takes effort to move between them, and because
some locations may be "absent" and are distinguishable from locations that are
"present". Failure is only represented, in a weak but realistic sense, as becoming forever
unreachable.
Mobility
There are different senses in which people talk about "process mobility"; we try to dis-
tinguish between them.
A π-calculus channel is mobile in the sense that it can be transmitted over another
channel. Let's imagine a process as having a mass, and its channels as springs connect-
ing it to other processes. When springs are added and removed by communication, the
process mass is pulled in different directions. Therefore, the set of channels of a process
influences its location, and a change of channels causes the process to change location.
This is particularly clear if a process has a single active channel at any time, because a
single spring will strongly influence a process location. By this analogy, channel mo-
bility can be interpreted as causing process mobility.
However, our desired notion of process mobility is tied to the idea of crossing do-
main barriers. This is a very discrete, on-off, kind of mobility: either a process is inside
a domain, or it is not. Representing this kind of mobility by adding and removing chan-
nels is not immediate. For example, if a π-calculus channel crosses a barrier (that is, if
The fact that the latter can be formally reduced to the former [34] is best ignored for
this discussion.
already required to handle boundaries, the need for cryptographic extensions does not
arise immediately. For example, a boundary enclosing a piece of text can be seen as an
encryption of the text, in the sense that a capability, the cryptokey, is needed to cross
the boundary and read the text. There is an unexpected similarity between a firewall sur-
rounding a major company and the encryption of a piece of data, both being barrier-
based security mechanisms at vastly different scales.
4 Ambients
We have now sufficiently discussed our design constraints and the deficiencies of ex-
isting solutions, and we are finally ready to explain our own proposal in detail. We want
to capture in an abstract way, notions of locality, of mobility, and of ability to cross bar-
riers. To this end, we focus on mobile computational ambients; that is, places where
computation happens and that are themselves mobile.
4.1 Overview
Briefly, an ambient, in the sense in which we are going to use this word, is a place that
is delimited by a boundary and where computation happens. Each ambient has a name,
a collection of local processes, and a collection of subambients. Ambients can move in
and out of other ambients, subject to capabilities that are associated with ambient
names. Ambient names are unforgeable, this fact being the most basic security property.
In further detail, an ambient has the following main characteristics.
• An ambient is a bounded place where computation happens.
If we want to move computations easily we must be able to determine what parts
should move. A boundary determines what is inside and what is outside an ambient,
and therefore determines what moves. Examples of ambients, in this sense, are: a
web page (bounded by a file), a virtual address space (bounded by an addressing
range), a Unix file system (bounded within a physical volume), a single data object
(bounded by "self") and a laptop (bounded by its case and data ports). Non-examples
are: threads (where the boundary of what is "reachable" is difficult to determine) and
logically related collections of objects. We can already see that a boundary implies
some flexible addressing scheme that can denote entities across the boundary; exam-
ples are symbolic links, URLs (Uniform Resource Locators) and Remote Procedure
Call proxies. Flexible addressing is what enables, or at least facilitates, mobility. It
is also, of course, a cause of problems when the addressing links are "broken".
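As a concrete (and entirely unofficial) reading of this description, an ambient can be pictured as a tree node carrying a name, local processes, and subambients. The Python encoding below is an illustrative assumption, not part of the calculus:

```python
# A minimal sketch of an ambient as a tree node: a name on the boundary,
# local processes inside it, and nested subambients.
class Ambient:
    def __init__(self, name):
        self.name = name        # label on the boundary
        self.processes = []     # local processes (pending instructions)
        self.children = []      # subambients
        self.parent = None

    def add_child(self, child):
        child.parent = self
        self.children.append(child)

    def tree(self):
        """Render the hierarchy, e.g. 'a[b[]]'."""
        inner = " ".join(c.tree() for c in self.children)
        return f"{self.name}[{inner}]"

root = Ambient("root")
a, b = Ambient("a"), Ambient("b")
root.add_child(a)
a.add_child(b)
print(root.tree())  # root[a[b[]]]
```

The boundary is exactly the node: what is inside the node moves with it, which is the sense in which "a boundary determines what moves".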
Enter reduction
In this situation the operation in m can execute, resulting in the configuration to the
right of the arrow. The result of the operation is that the folder n becomes a subfolder
of the folder m, and the gremlin who executed the instruction is ready to continue with
P. The instruction in m has been consumed. Any other gremlins or subfolders in Q and
R are unchanged.
A reduction can happen only if the conditions on the left of the arrow are satisfied.
That is, in m executes only if there is a sibling folder labeled m. Otherwise, the operation
remains blocked until a sibling folder labeled m appears nearby. Such a folder may ap-
pear, for example, because it moves near n, or because some of the gremlins in Q cause
n to move near it.
Many reductions can be simultaneously enabled, in which case one is chosen non-
deterministically. For example, there could be two distinct folders labeled m near n, in
which case n could enter either one. Also, there could be another gremlin, as part of Q,
trying to execute an in m operation, in which case exactly one of the gremlins would
succeed. Moreover, there could be another gremlin in Q trying to execute an in p oper-
ation, and there could be another folder labeled p near n, in which case the n folder could
enter either m or p.
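The enter reduction described above — n[in m.P | Q] | m[R] becomes m[n[P | Q] | R] — can be sketched as a transformation on a small folder tree. The dictionary encoding and helper names below are my own, not the paper's formal syntax:

```python
# Enter reduction: if siblings n and m exist and n's first pending
# instruction is `in m`, move folder n inside folder m, consuming the op.
def amb(name, ops=(), children=()):
    return {"name": name, "ops": list(ops), "children": list(children)}

def enter(parent, n_name, m_name):
    kids = parent["children"]
    n = next(k for k in kids if k["name"] == n_name)
    m = next(k for k in kids if k["name"] == m_name)
    assert n["ops"] and n["ops"][0] == ("in", m_name), "op not enabled"
    n["ops"].pop(0)           # the instruction `in m` is consumed
    kids.remove(n)            # n leaves its current level...
    m["children"].append(n)   # ...and becomes a subfolder of m

top = amb("top", children=[amb("n", ops=[("in", "m")]), amb("m")])
enter(top, "n", "m")
print([c["name"] for c in top["children"]])  # ['m']
```

If no sibling labeled m exists, the `assert` (standing in for "the operation remains blocked") fails rather than firing, matching the side condition on the left of the arrow.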
The Exit Reduction
Our second reduction is essentially the inverse of the previous one: two nested folders
become siblings. On the left of the arrow, the folder labeled n contains a gremlin that is
ready to execute the instruction out m (meaning: "let the surrounding folder exit its par-
ent labeled m"), and then continue with P. The parent folder is in fact labeled m.
In this situation the operation out m can execute, resulting in the configuration to
the right of the arrow. The result of the operation is that the folder n becomes a sibling
of the folder m, and the gremlin who executed the instruction is ready to continue with
P. The instruction out m has been consumed. Any other gremlins or subfolders in Q and
R are unchanged.
Exit reduction
The operation out m executes only if the parent folder is labeled m. Otherwise, the
operation remains blocked until a parent folder labeled m materializes. The parent may
become m, for example, because some of the gremlins in Q cause n to move inside a
folder labeled m, at which point out m can execute.
Again, several reductions may be enabled at the same time. For example, there
could be a gremlin in Q trying to execute an in p operation, and there could be a folder
labeled p in R. Then, the folder n could either exit m or enter p.
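The exit reduction — m[n[out m.P | Q] | R] becomes n[P | Q] | m[R] — admits the same kind of sketch, again using my own dictionary encoding rather than the paper's syntax:

```python
# Exit reduction: if n sits inside m (itself under `grand`) and n's first
# pending instruction is `out m`, move n up to be m's sibling.
def amb(name, ops=(), children=()):
    return {"name": name, "ops": list(ops), "children": list(children)}

def exit_(grand, m_name, n_name):
    m = next(k for k in grand["children"] if k["name"] == m_name)
    n = next(k for k in m["children"] if k["name"] == n_name)
    assert n["ops"] and n["ops"][0] == ("out", m_name), "op not enabled"
    n["ops"].pop(0)              # the instruction `out m` is consumed
    m["children"].remove(n)      # n leaves its parent m...
    grand["children"].append(n)  # ...and becomes a sibling of m

top = amb("top", children=[amb("m", children=[amb("n", ops=[("out", "m")])])])
exit_(top, "m", "n")
print([c["name"] for c in top["children"]])  # ['m', 'n']
```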
The Open Reduction
Our third reduction is used to discard a folder while keeping its contents. In the picture
it is helpful to imagine that there is a folder surrounding the entities on the left of the
arrow, so that open n followed by P is a gremlin of that folder, and n is one of its sub-
folders. The gremlin is ready to execute the instruction open n (meaning: "let a nearby
folder labeled n be opened"), and then continue with P. A folder labeled n happens to
exist nearby.
In this situation the operation open n can execute, resulting in the configuration to
the right of the arrow. The result of the operation is that the folder n is discarded, but its
contents Q are spilled in the place where the folder n used to be. The instruction open n
has been consumed, and the gremlin is ready to continue with P.
Open reduction
As before, the operation open n is blocked if there are no folders labeled n nearby.
If there are several, any one of them may be opened.
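The open reduction — open n.P | n[Q] becomes P | Q — discards a boundary while keeping the contents. A sketch in the same unofficial dictionary encoding:

```python
# Open reduction: consume the `open n` instruction of the enclosing folder,
# discard the boundary of subfolder n, and spill n's contents in place.
def amb(name, ops=(), children=()):
    return {"name": name, "ops": list(ops), "children": list(children)}

def open_(parent, n_name):
    assert parent["ops"] and parent["ops"][0] == ("open", n_name), "op not enabled"
    parent["ops"].pop(0)                      # `open n` is consumed
    n = next(k for k in parent["children"] if k["name"] == n_name)
    parent["children"].remove(n)              # the folder n is discarded...
    parent["children"].extend(n["children"])  # ...but its contents Q remain
    parent["ops"].extend(n["ops"])            # n's gremlins keep running here

top = amb("top", ops=[("open", "n")], children=[amb("n", children=[amb("q")])])
open_(top, "n")
print([c["name"] for c in top["children"]])  # ['q']
```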
The Copy Reduction
Our fourth reduction is used to replicate folders and all their contents. On the left of the
arrow, a copy machine is ready to make copies of a folder P (actually, P could also be
a gremlin or any collection of folders and gremlins). We should imagine that the origi-
nal P is firmly placed under the cover of the copy machine, and is compressed into im-
mobility: none of the gremlins or folders of P can operate or move around (otherwise
the copy might be "blurred"). However, as soon as a copy of P is made, that copy is free
to execute.
Copy reduction
The copy machine can produce a new copy of P and of all its contents at will (no-
body needs to push the copy button). After that, the copy machine can operate again,
indefinitely. So, on the right of the arrow we have the same configuration as on the left,
plus a fresh copy of P. We could think that copies of P are made on demand, whenever
needed, rather than being continuously produced.
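The copy machine corresponds to replication: from one immobile original, any number of independent running copies can be produced on demand. A sketch using a factory of deep copies (my own stand-in, assuming the dictionary encoding used earlier):

```python
# Copy reduction, on demand: the original is held immobile inside the
# "machine"; each call produces a fresh, fully independent copy.
import copy

def copy_machine(original):
    """Return a zero-argument factory; each call yields a fresh copy."""
    frozen = copy.deepcopy(original)   # the original under the cover
    return lambda: copy.deepcopy(frozen)

make = copy_machine({"name": "p", "children": []})
c1, c2 = make(), make()
c1["children"].append({"name": "q", "children": []})
print(len(c2["children"]))  # 0 -- copies are independent
```

Deep copying captures the requirement that the original is "compressed into immobility": later activity in one copy cannot blur the machine's master or any sibling copy.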
Name Creation
The handling of names is a delicate and fundamental part of our calculus. Fortunately,
it is very well understood: it comes directly from the π-calculus.
Name creation is an implicit operation, in the sense that there are no reductions as-
sociated with it. It is represented below as the creation of a rubber stamp for a name n,
which can be used to stamp folder labels. Any number of folders can be stamped with
the same rubber stamp.
A rubber stamp is used not so much to give a name as to give authenticity to a fold-
er. There are several components to this notion, and we need to stretch our office met-
aphor a bit to make it fit with the intended semantics.
At a microscopic scale, each rubber stamp has slight imperfections that can be used
to tell which rubber stamp was used to stamp each particular folder. Therefore, the par-
ticular name chosen for a rubber stamp is irrelevant: what is really important is the re-
lationship between a rubber stamp and the folders it has stamped. Humans, however,
like to read names, not microscopic imperfections, so we keep names associated with
rubber stamps and folder labels. Still, we are free to change those names at any time, as
long as this is done consistently for a rubber stamp, all its related folders, and all the
other uses of the name on the rubber stamp. This consistent renaming could be consid-
ered as a reduction, but it is simple enough to be considered as a basic equivalence be-
tween configurations of folders, expressing the fact that superficial names are not really
important.
In our metaphor, a copy machine can be used to copy anything contained in a fold-
er, including rubber stamps. Therefore, even if we started with all the rubber stamps
having different names, eventually there might be multiple rubber stamps carrying the
same name. To make authenticity work, we have to assume that copy machines cannot
copy rubber stamps perfectly at the microscopic level: when a rubber stamp is replicat-
ed, a different set of microscopic imperfections is generated. That is, rubber stamps are
unforgeable by assumption.
For all these reasons, two rubber stamps carrying the same name n are really two
different rubber stamps. To preserve authenticity, we do not want these rubber stamps,
and the folders they stamp, to get confused. In our visual representation, we collect all
the folders stamped by a rubber stamp, and all the other occurrences of its name, within
a dashed boundary: this way we can always tell, graphically, which folders were authen-
ticated by a rubber stamp, even when different rubber stamps have the same name.
This dashed border is a flexible boundary and can move about fairly freely (it is just
a bookkeeping device). We have three main invariants for where a dashed border can
be placed. First it must always be connected with its original rubber stamp. Second, it
must always enclose all the folders that have ever been stamped with its particular
stamp and all other occurrences of the name (e.g. within gremlin code); if a folder
moves away, the dashed boundary may have to be enlarged. Third, dashed boundaries
for two rubber stamps with the same name must not intersect; if we should ever need to
do so we shall first systematically rename one of the two rubber stamps and the related
names, so that there is no confusion. The dashed boundaries for rubber stamps with dif-
ferent names can freely intersect.
allowed forbidden
It is allowable to nest dashed boundaries for rubber stamps with the same name: an
occurrence of that name will refer to the closest enclosing rubber stamp, in standard
block scoping style.
Leaves of the Syntax
At the bottom of our syntax there is the inactive gremlin, which can be represented as
an empty border. The inactive gremlin has no reductions.
(Figure: the inactive gremlin, drawn as an empty border.)
Inactive gremlins are often simply discarded or omitted. For example, multiple in-
active gremlins can be collapsed into one inactive gremlin, and a folder containing only
one inactive gremlin is usually represented as an empty folder.
(Figure: 0 | 0 = 0; a folder containing only 0 is drawn empty.)
Programs in the folder calculus are built from these foundations, by assembling
collections of gremlins, folders, rubber stamps, and copy machines, and possibly plac-
ing them inside other folders.
The Theoretical Power of Mobility
Before moving on to our next and final reduction, we pause and consider the operations
we have introduced so far. These operations are purely combinatorial, that is, they in-
troduce no notion of parameter passing, or message exchange. Also, they deal purely
with mobility, not with communication or computation of any familiar kind. Yet, they
are computationally complete: Turing machines can be encoded in a fairly direct way
(see [13]).
Moreover, very informally, it is possible to see an analogy between the Enter re-
duction and increment, the Exit reduction and decrement, the Open reduction and test,
and the Copy reduction and iteration. Therefore all the ingredients for performing arith-
metic are present, and it is in fact possible to represent numbers as towers of nested fold-
ers, with appropriate operations.
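Purely as an illustration (in Python, outside the chapter's formalism), the tower-of-folders encoding of numbers can be mimicked by letting a number be the nesting depth of folders; the names zero, succ, and depth are inventions of this sketch.

```python
# Numbers as towers of nested folders: the value is the nesting depth.
# This is a toy analogy, not an implementation of the calculus.
def zero():
    return []                 # an empty folder represents 0

def succ(n):
    return [n]                # one more enclosing folder: n + 1

def depth(folder):
    # Read the number back by counting nesting levels.
    count = 0
    while folder:
        folder = folder[0]
        count += 1
    return count

three = succ(succ(succ(zero())))
assert depth(three) == 3
```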
Data structures such as records and trees can be represented easily, by nested fold-
ers; folder names represent pointers into such data structures. Storage cells can be represented as folders whose contents change in response to interactions with other folders;
in this case a folder name represents the address of a cell.
In summary, the mobility part of the Folder Calculus already has the full power of
any computationally complete language. In principle, one could use it as a graphical
scripting language for the office/desktop metaphor.
the simplest imaginable. In particular, they do not conflict with the notion of strict fold-
er containment, and do not duplicate mobility functionality.
The Read Reduction
We begin by introducing a new entity that can sit inside a folder: a message. To remain
within the office metaphor, we imagine writing messages onto throw-away Post-it notes
that are attached to the inside of folders.
A gremlin can write the name of a folder on a note, and can attach the note to the
current folder (the folder the gremlin is in). This is an output operation, and is represent-
ed graphically by a message written on a note. We shall shortly discuss the particular messages used in the folder calculus. More generally, we may imagine writing any kind of data as a message; in this view, a note can be seen as a nameless data file that is kept within a folder.
(Figure: Output: a note carrying a message M.)
Conversely, a gremlin can grab any note attached to the current folder, read its mes-
sage, discard the note, and proceed with the knowledge of the message. This is an input
operation, and is represented graphically by a process P with occurrences of the variable x (written P(x)) that is waiting for a note with a message to be bound to x.
(Figure: Input: a gremlin P(x) waiting to read a message into x.)
The Read reduction is the interaction between input and output operations or,
equivalently, between message notes and input operations. In a situation where an input
and an output are present, the Read reduction can execute, resulting in the configuration
on the right of the arrow, which is simply P{M}: the residual gremlin P that has read M into x. The note that appears on the left of the arrow is discarded: it is consumed by read-
ing and does not appear on the right. (If the note needs to persist, it can be replicated by
a copy machine.)
(Figure: the Read reduction: (x).P | (M) → P{M}.)
An input operation blocks if there are no available messages. Several input opera-
tions may contend for the same message and only one of them will obtain it and be able
to proceed. An output operation, however, never blocks (it is asynchronous); it simply
drops a message in the current folder and has no continuation.
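These rules can be mimicked in ordinary code; the sketch below (in Python, purely illustrative) models one folder's notes with a queue: output never blocks, input blocks until a note is available, and a note is consumed by exactly one reader.

```python
from queue import Queue

# One folder's note board. Output drops a note with no continuation;
# input removes exactly one note, blocking while none is available.
# (The calculus does not guarantee FIFO order; the queue here does.)
class FolderNotes:
    def __init__(self):
        self._notes = Queue()

    def output(self, message):
        self._notes.put(message)      # asynchronous: never blocks

    def input(self):
        return self._notes.get()      # blocks if no message is present

home = FolderNotes()
home.output("n")
home.output("m")
first = home.input()                  # consumes one note
assert first == "n"
```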
Inputs and outputs usually happen within a folder (although the reduction above allows them to happen on the desktop). Within such a folder, the identities of the input and output gremlins are not important: anybody can talk to anybody else, in the style of a
chat room. In practice, different folders will be dedicated to different kinds of conver-
sation, so one can make assumptions about what should be said and what can be heard
in a given folder. This idea gives rise to a type system for the folder calculus [14].
Messages and Capabilities
We can imagine many kinds of messages that can be written on Post-it notes. Here we
focus on messages that are capabilities: they allow the reader of a message to perform
privileged actions. There are two kinds of capabilities, used in different contexts: nam-
ing capabilities and navigation capabilities.
Naming capabilities are simply ambient names used as messages. A name n can be
seen as a capability to construct (and rubber-stamp) a folder named n.
We have already seen the main navigation capabilities, implicitly. We have pre-
sented the operations in n, out n and open n as three distinct operations followed by a
continuation P. In fact, they are special cases of a single operation M.P, where M is
navigation capability and P is the continuation after navigation. Given a name n, in n is
the capability to enter an ambient named n, out n is the capability to exit an ambient
named n, and open n is the capability to open an ambient named n. Navigation capabil-
ities are extracted from naming capabilities, meaning that knowing n implies the ability
to construct, for example, in n, but knowing in n does not imply knowing n.
Navigation capabilities can be composed into navigation paths. For example, in n. out m. open p. out q is a path capability that can be written in a single message, read into an input variable x, and executed in its entirety by x.P (assuming, of course, that the path can be followed). It is useful to have an empty navigation path, written here, such that here.P has no effect and continues with P, and M.here = here.M = M.
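The algebra of paths, with here as the unit of concatenation, can be sketched as follows; the tuple representation and the helper names cap and concat are assumptions of this illustration, not the calculus itself.

```python
# Paths as tuples of capability tokens; `here` is the empty path and
# acts as the identity for concatenation: M.here = here.M = M.
here = ()

def cap(kind, name):
    return ((kind, name),)            # a one-step path

def concat(m, n):
    return m + n                      # path concatenation M.N

path = concat(concat(cap("in", "n"), cap("out", "m")), cap("open", "p"))
assert concat(path, here) == path
assert concat(here, path) == path
assert path == (("in", "n"), ("out", "m"), ("open", "p"))
```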
Example
This is an artificial example that uses each of the five folder reductions once. We use a
single dashed line for multiple rubber stamps when the order of nesting of dashed lines
does not matter. We remove rubber stamps when they are no longer needed.
(Figure: an example configuration performing, in turn, the Enter, Open, Copy, Read, and Exit reductions.)
The reader is encouraged to follow the reductions, comparing them with their defini-
tions.
Example: Adobe Distiller
Adobe Distiller is a program that converts files in Postscript format to files in Adobe
Acrobat format. The program can be set up to work automatically on files that are
placed in a special location. In particular, when a user drops a Postscript file into an in-
box folder, the file is converted to Acrobat format and dropped into a nearby outbox
folder.
The following figure describes such a behavior. The distiller folder contains the in-
box and outbox folders mentioned above; outbox is initially empty. The input folder
contains the file the user wants to convert, in the form of a message. The input folder also contains a gremlin that moves the input folder into the inbox. (We can imagine that
this piece of gremlin code is generated automatically as a result of the user dragging the
input folder into the inbox folder.)
(Figure: the distiller folder, containing the inbox and outbox folders; the input folder carries the file as a message together with the gremlin code that moves it into the inbox.)
The inbox contains the program necessary to do the format conversion and drop the
result into the outbox. First, any input folder arriving into the inbox must be opened to
reveal the Postscript file; this is done by the copy machine on the left. Then, any such
file is read; this is done by the copy machine on the right. As a result of each read, an
output folder is created to contain a result. Inside each output folder, a file is distilled
(by the external operation distill{x)) and left there as an output. The output folder is
moved into the outbox folder.
It should be noted that the program above represents highly concurrent behavior,
according to the reduction semantics of the folder calculus. Multiple files can be
dropped into the inbox and can be processed concurrently. The opening of the input folders and the reading of their contents are done in a producer-consumer style. More-
over, each distilling process may be executing while its output folder is traveling to the
outbox. Representing this behavior in an ordinary concurrent language would not be en-
tirely trivial; here we have been able to express it without cumbersome locking and syn-
chronization instructions.
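For comparison, here is a conventional rendering of the same producer-consumer behavior, sketched in Python (which is not part of the chapter); the distill function below is a stand-in for the external conversion operation, and the thread count is arbitrary.

```python
import threading
import queue

inbox = queue.Queue()
outbox = queue.Queue()

def distill(postscript_file):
    # Stand-in for the external distill(x) operation.
    return postscript_file.replace(".ps", ".pdf")

def worker():
    # Open each arriving input, read its contents, distill it, and
    # drop the result into the outbox; workers run concurrently.
    while True:
        f = inbox.get()
        if f is None:
            break
        outbox.put(distill(f))

workers = [threading.Thread(target=worker) for _ in range(2)]
for w in workers:
    w.start()
for f in ["a.ps", "b.ps", "c.ps"]:
    inbox.put(f)
for _ in workers:
    inbox.put(None)                   # signal shutdown
for w in workers:
    w.join()

assert sorted(outbox.queue) == ["a.pdf", "b.pdf", "c.pdf"]
```

Note how the explicit shutdown signaling and joining are exactly the kind of bookkeeping that the folder-calculus version avoids.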
Example: Synchronous Output
It is sometimes useful to know that a message has been read by somebody before pro-
ceeding. Synchronous output is an output operation with a continuation that is triggered
only after the output message has been read.
Synchronous output is expressible within the folder calculus, if we assume that the
calculus has been extended to allow the exchange of pairs of values (this extension can
in fact be encoded within the calculus). Together with synchronous output, we need to
define a matching input operation. These new operations are depicted with additional
striped borders in the figures.
The synchronous output of a message M is obtained by creating a fresh name, k,
outputting (asynchronously) the pair M,k, and waiting for the appearance of a folder
named k before proceeding.
(Figure: synchronous output: the pair M,k is output asynchronously; the continuation waits to open a folder named k.)
The corresponding input operation for a variable x is obtained by expecting an input pair x,k, creating an empty folder named k (which triggers the synchronous output continuation), and continuing with the normal use of the input x.
(Figure: synchronous input: read the pair x,k, create an empty folder named k, and continue with P.)
Therefore, when the process P starts running, it can assume that somebody has read the
message M.
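The encoding can be mimicked with threads, as in the following Python sketch (illustrative only): the sender emits the pair M,k for a fresh k and blocks until the k-folder shows up; the names board, acks, and log are artifacts of this illustration.

```python
import threading
import queue
import secrets

board = queue.Queue()    # notes dropped in the shared folder
acks = queue.Queue()     # folders named k appearing alongside
log = []

def sender():
    k = secrets.token_hex(4)          # fresh, private name k
    board.put(("hello", k))           # asynchronous output of the pair
    assert acks.get() == k            # block until the k-folder appears
    log.append("sender resumed")

def receiver():
    m, k = board.get()                # input the pair x,k
    log.append("read " + m)
    acks.put(k)                       # create the folder named k

r = threading.Thread(target=receiver)
s = threading.Thread(target=sender)
r.start(); s.start()
r.join(); s.join()
assert log == ["read hello", "sender resumed"]
```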
4.4 Security
The folder calculus has built-in features that allow it to represent security and encryp-
tion situations rather directly; that is, without extending the calculus with ad-hoc prim-
itives for encryption and decryption. In this section, we discuss a number of examples
based on simple security protocols.
We should make clear here what we mean by security for the folder calculus. Security problems arise at every level of a software system, not just at the cryptographic level. Given any set of security primitives, and any system written with those primitives, one can ask whether the system can be attacked at the "low-level", by attacking weaknesses in the implementation of the primitives, or at the "high-level", by attacking weaknesses in how the primitives are used. Efforts are underway to study the security
of high-level abstractions under low-level attacks [1], but here we are only concerned
with high-level attacks. That is, we assume that an attacker has at its disposal only the
primitives of the folder calculus. This is the kind of attack that a malicious party could
mount against honest folders interacting within a trusted server or over a trusted net-
work. For example, even within a perfectly trusted server, if a folder gives away its
name a, it could be killed by an attacker performing open a.
Authentication
In this example, a home folder is willing to let any folder in, but is willing to open only
those folders that are recognized as having originally come from home. Opening a fold-
er implies conferring top-level execution privilege to its gremlins, and this privilege
should not be given to just anybody. A particular home gremlin that has execution priv-
ilege wants to leave home and then wants to come back home and be given the same
execution privilege it enjoyed before.
The mechanism the home folder uses to recognize the gremlin is a pass: a use-once
authentication token. Passes are generated in the left-hand side of the figure below,
where a copy machine produces fresh configurations, each consisting of a new rubber stamp, a single message (the pass) stamped by the rubber stamp, and a single open capability for the name on the rubber stamp.
The traveling gremlin is on the right-hand side of the figure. It inputs a pass n by
reading a message into a variable x, and it eventually uses the pass to label a folder. The
gremlin, in the form of the folder g, takes a short walk outside and comes back. Then it
exposes a folder named n, which is opened by the corresponding open n capability that
was left behind. The gremlin P can then continue execution at the top level of home; for
example, it may read another pass and leave again.
Since the scope of each pass n is restricted by a locally-generated rubber stamp, the
capability open n is not at risk of ever opening some foreign folder. There are actually
two underlying security assumptions here. The first is that nobody can accidentally or
maliciously create a pass that matches n: this can be guaranteed in practice with arbi-
trarily high probability. The other assumption is that nobody can steal the name n from
the traveling gremlin. This seems very hard or impossible to guarantee in general, par-
ticularly if the gremlin visits a hostile location that disassembles the gremlin by low-
level mechanisms (below the abstraction level of the folder calculus). However, if the
gremlin visits only trusted locations through trusted networks (ones that preserve the
abstractions of the folder calculus), then no interaction can cause the gremlin to be un-
willingly deprived of its pass.
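A sketch of this use-once discipline, written in Python for illustration only: home issues a fresh, unguessable name per departure and retains the matching open capability, so a returning gremlin is admitted exactly once per pass. The class and method names are inventions of this sketch.

```python
import secrets

# Use-once passes: home keeps one pending `open n` per issued pass,
# and opening consumes it, so a pass cannot be replayed.
class Home:
    def __init__(self):
        self._open_caps = set()       # pending `open n` capabilities

    def issue_pass(self):
        n = secrets.token_hex(8)      # fresh rubber-stamped name
        self._open_caps.add(n)
        return n

    def admit(self, n):
        # `open n` succeeds at most once per issued pass.
        if n in self._open_caps:
            self._open_caps.remove(n)
            return True
        return False

home = Home()
p = home.issue_pass()
assert home.admit(p)                  # the traveler is opened at home
assert not home.admit(p)              # a replayed pass is rejected
assert not home.admit("forged")       # a guessed pass fails
```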
Nonces
A nonce is a use-once token that can be used to ensure the freshness of a message. In
the example below, a nonce is represented by a fresh name n. The folder a sends the
nonce to the folder b, where the nonce is paired with a message M and sent back.
The folder a then checks that the nonce returned by b is the same one that was sent to b. This is achieved by creating an empty folder named by the returned nonce, and try-
ing to open that folder with the original nonce. If the test is successful, then a knows
that the message M is fresh: it was generated after the creation of n.
If nonce and msg are public names, then an attacker could disrupt this protocol by destroying the message folders in transit. Even worse, an attacker could inject nonce and msg folders into a or b containing misleading information or damaging code. If a
and b have already established shared keys, they can avoid these problems by exchang-
ing and opening only messages encrypted under the keys (this is discussed shortly).
(Figure: the nonce protocol: a sends a fresh nonce n to b; b pairs it with the message M and sends the pair back via out b. in a.)
An attacker could also impersonate a or b by creating folders with those names, and could then intercept messages. However, the names of principals like a and b will nor-
mally be closely guarded secrets, so that impersonation cannot happen. In contrast, ca-
pabilities like in a will be given out freely, since the act of entering a folder cannot by
itself cause damage to the folder, even if who enters is malicious.
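The freshness check can be sketched as follows (a Python illustration, not part of the chapter); the responder functions stand in for b and for a replaying attacker, and all names are assumptions of the sketch.

```python
import secrets

# A toy nonce exchange: a sends a fresh nonce and accepts the reply
# only if it carries that same nonce back.
def a_runs_protocol(respond):
    n = secrets.token_hex(8)          # fresh nonce, created for this run
    returned, message = respond(n)
    if returned == n:                 # the `open` test succeeds
        return message                # fresh: generated after n existed
    return None                       # stale or forged reply rejected

def honest_b(n):
    return (n, "M")                   # pair the nonce with the message

def replay_attacker(n):
    return ("old-nonce", "stale M")   # replays an earlier reply

assert a_runs_protocol(honest_b) == "M"
assert a_runs_protocol(replay_attacker) is None
```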
Shared Keys
A name can be used as a shared key, as long as it is kept secret and shared only by cer-
tain parties. A shared key can be reused multiple times, e.g., to encrypt a stream of mes-
sages.
A message encrypted under a key k can be represented as a folder that contains the message and whose label is k. We call such a folder a k-envelope for the message. Knowledge of k (or, at least, of the capability open k) is required to open the folder and read the message.
(Figure: encryption: a plaintext M inside a k-envelope. Decryption: open k opens a k-envelope and the contents are read.)
!P          Replication of P.
(M)         Output M.
(n).P       Input n, followed by P.
(P)         Grouping.

Messages M, N
n           A name
in n        An entry capability
out n       An exit capability
open n      An open capability
here        The empty path of capabilities
M.N         The concatenation of two paths
The textual syntax for the folder calculus is in fact the full syntax of the ambient calculus we were alluding to previously. Therefore, our folder calculus metaphor is quite exact: the syntax and semantics of our formal ambient calculus [13] have now been completely explained through the metaphor.
Example: Adobe Distiller
This is the textual representation of the example in Section 4.3.
distiller[
  inbox[
    !open input |
    !(x) output[(distill(x)) | out inbox. in outbox]] |
  outbox[]]
constructs that are semantically compatible with the principles of the ambient calculus, and consequently of wide-area networks.
These principles include (A) WAN-soundness: a wide-area network language can-
not adopt primitives that entail action-at-a-distance, continued connectivity, or security
bypasses, and (B) WAN-completeness: a wide area network language must be able to
express the behavior of web surfers and of mobile agents and users.
5.1 Related Languages
Many software systems have explored and are exploring notions of mobility and wide-
area computation. Among these are:
• Obliq [11]. The Obliq language attacks the problems of remote execution and
mobility for distributed computing. It was designed in the context of local area
networks. Within its scope, Obliq works quite well, but is not really suitable for
computation and mobility over the Web, just like most other distributed para-
digms developed in pre-Web days.
• Telescript [37]. Our ambient model is partially inspired by Telescript, but is al-
most dual to it. In Telescript, agents move whereas places stay put. Ambients,
instead, move whereas processes are confined to ambients. A Telescript agent,
however, is itself a little ambient, since it contains a "suitcase" of data. Some
nesting of places is allowed in Telescript.
• Java [23]. Java provides a working paradigm for mobile computation, as well as
a huge amount of available and expected infrastructure on which to base more
ambitious mobility efforts. The original Java mobility model, however, was based on mobility of code, not mobility of data or active computations. Data mo-
bility has now been achieved by the Java RMI extension, but computation mo-
bility (e.g. for threads or live objects) is still problematic.
• Linda [15]. Linda is a "coordination language" where multiple processes inter-
act in a common space (called a tuple space) by dropping and picking up tokens
asynchronously. Distributed versions of Linda exist that use multiple tuple spac-
es and allow remote operations over those. A dialect of Linda [16] allows nested
tuple spaces, but not mobility of the tuple spaces.
• The Join Calculus Language [22] is an implementation of the distributed Join
Calculus. The plain Join Calculus introduced an original and elegant synchroni-
zation mechanism, where a procedure invocation may be triggered by the join of
multiple partial invocations originating from different processes. The Distribut-
ed Join Calculus extends the Join Calculus with an explicit hierarchy of loca-
tions. As we already mentioned, the nature of this calculus makes distributed
implementation relatively easy. Migration of locations is allowed within the hi-
erarchy. Behavior of the system under a failure model is being investigated.
• WebL [27] is a language that specializes in fetching and processing Web pages.
It uses service combinators [12] to retrieve streams of data from unreliable serv-
ers, and it uses a sophisticated pattern matching sublanguage for analyzing struc-
tured (but highly variable) data and reassembling it.
(such as mainframe computers). These mobility distinctions are not reflected in the se-
mantics of ambients, but can be added as a refinement of the basic model, or embedded
in type systems that restrict the mobility of certain ambients.
Migration and Transportation
Each ambient is completely self-contained, and can be moved at any time with all its
running computations. If an ambient encloses a whole application, then the whole running application can be moved without needing to restart or reinitialize it. In practice, an
application will have ties to the local window system, the local file system, etc. These
ties, however, should only be via ambient names. When moving the applications, the
old window system ambient, say, becomes unavailable, and eventually the new window
system ambient becomes available. Therefore, the whole application can smoothly
move and reconnect its bindings to the new local environment. Some care will still be
needed to restart in a good state (say, to refresh the application window), but this is a
minor adjustment compared to what one would have to do if hard connections existed
between the application and the environment [5].
Communication
The basic communication primitives of the ambient calculus are based on the asynchro-
nous model and do not support global consensus or failure detection. These properties
should be preserved by any higher-level communication primitives that may be added
to the basic model, so that the intended semantics of communication over wide-area networks is preserved.
The ambient calculus directly supports only local communication within an ambi-
ent. Remote communication (for example, RPC) can be interpreted as mobile output
packets that transport and deposit messages to remote locations, wait for local input, and
then come back. The originator of an RPC call may block for the answer to come back
before proceeding, in the style of a synchronous call. In this interpretation, the outgoing
and incoming packets may get stuck for an arbitrary amount of time, or get lost. There
may be no indication that the communication has failed, and therefore the invoking pro-
cess may block forever without receiving a communication exception. This is as it
should be, since arbitrary delays are indistinguishable from failures. (Note, though, that
a time-out mechanism is easily implemented, by placing a remote invocation in parallel
with another activity that waits a certain time and, if the invocation has not completed,
causes something else to happen.)
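The timeout idiom just described can be sketched in ordinary code; the Python helper below is an illustration, not a primitive of the calculus, and its names are assumptions. The invocation runs in parallel with a waiting activity, and whichever finishes first decides the outcome.

```python
import threading
import queue

# Run the remote call in a parallel thread; the timed get() plays the
# role of the activity that waits and then triggers the alternative.
def rpc_with_timeout(remote_call, seconds):
    answers = queue.Queue()
    threading.Thread(target=lambda: answers.put(remote_call()),
                     daemon=True).start()
    try:
        return answers.get(timeout=seconds)    # reply arrived in time
    except queue.Empty:
        return "timed out"   # delay and failure are indistinguishable

assert rpc_with_timeout(lambda: "answer", 2.0) == "answer"
never = threading.Event()                      # an event nobody sets
assert rpc_with_timeout(never.wait, 0.1) == "timed out"
```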
Other examples of derived communication mechanisms include parent-child com-
munication, and communication between siblings (perhaps aided by the common par-
ent). All these appear quite useful, and will likely need to be included in any convenient
language.
Data Structures
Basic data structures, such as booleans and integers, can be encoded in terms of ambi-
ents, but the encodings are not practical. Therefore, basic types should be taken as prim-
itive, as usual.
Ambients can directly express hierarchies, so it should not be surprising that they can easily represent structured data types such as records.
Given a definition def f(x1...xn) = P in an ambient, an invocation of f within that ambient reduces as follows:

f(V1...Vn)/here → P{x1←V1, ..., xn←Vn}
where the suffix "/here" indicates that a definition of f has to be found in the ambient where f is invoked. In general, a path may be used in place of here, in which case the definition of the resource is retrieved from the ambient obtained by following the path, and the invocation is executed there. In other words, the effect of f(V1...Vn)/p is to transport the invocation f(V1...Vn) along the path p, and then to invoke it within the target ambient. This can be expressed by two rules that each transport an invocation one step
further:
f(V1...Vn)/in n.p | n[Q] → n[Q | f(V1...Vn)/p]
n[f(V1...Vn)/out n.p | Q] → f(V1...Vn)/p | n[Q]
For example, we could have an expression such as the following:
n[def x() = (1);
  def f(y) = x()/here | (x'). (x'+y)]
| f(3)/in n
producing, after the invocation f(3)/in n:
n[def x() = (1);
  def f(y) = x()/here | (x'). (x'+y);
  (4)]
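A toy evaluator for path-directed invocation can make the transport rules concrete; the Python sketch below is an illustration under stated assumptions (only in steps are modeled, and the dictionary layout for ambients is an invention of the sketch).

```python
# `in n` steps move the invocation into child ambient n; at the empty
# path (here) the local definition is applied.
def deliver(ambient, path, f, args):
    if not path:                           # .../here: invoke locally
        return ambient["defs"][f](*args)
    step, name = path[0]
    if step == "in":
        return deliver(ambient["children"][name], path[1:], f, args)
    raise ValueError("only `in` steps are modeled in this sketch")

# Mirror of the example above: n defines x() = (1) and f(y) = x() + y.
n = {"defs": {}, "children": {}}
n["defs"]["x"] = lambda: 1
n["defs"]["f"] = lambda y: deliver(n, [], "x", []) + y
root = {"defs": {}, "children": {"n": n}}

assert deliver(root, [("in", "n")], "f", [3]) == 4   # as in f(3)/in n
```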
The definition and invocation of resources can be encoded in the pure ambient cal-
culus; this way, resources automatically acquire an ambient-like flavor. In particular, if
an ambient has no definition for an invocation, the invocation blocks until the definition
becomes available in the designated place. If an ambient has multiple definitions for an
invocation, any one of them may be used. If an ambient containing definitions is
opened, its definitions become definitions of its parent.
Modularization
An ambient P that includes a collection of local definitions can be seen as a module, or a class. More precisely, !P can be seen as a module or a class (since P there is inactive), while any active P generated from it can be seen as a module instance or an object.
The action of performing an open on such an ambient can be seen as importing
from a module or inheriting from a class, since ambient definitions are transplanted
from one ambient to another. Moreover, one can regard the notation f(M)/p either in module-oriented style as p.f(M) (the invocation of f(M) from the module at p), or in object-oriented style as delegation [36] of f(M) to the object found at p.
When seen as modules or components, ambients have several interesting and un-
usual properties.
First, ambients are first-class modules, in the sense that one can choose at run time which particular instance of a module to use.
Second, ambients support dynamic linking: missing subsystems can be added to a
running system simply by placing them in the right spot.
Third, ambients support dynamic reconfiguration. In most module and class systems, the identity of individual modules is lost immediately after static or dynamic linking. Ambients, though, maintain their identity at run time. As a consequence, a system
composed of ambients can be reconfigured by dynamically replacing an ambient with
another one. The blocking semantics of ambient interactions allows the system to
smoothly suspend during a configuration transition. Moreover, the hierarchical nature
of ambients allows the modular reconfiguration of entire subsystem, not just of individ-
ual modules. Dynamic reconfiguration is particularly valuable in long-running and
widely-deployed systems that cannot be easily stopped (for example, in telephone
switches); it is certainly no accident that thinking about the Internet led us to this prop-
erty.
Therefore, ambients can be seen as proper software components, according to a
paradigm often advocated in software engineering, where the components are not only
replaceable but also mobile.
Security
Ambient security is based on capabilities and on the notion that security checks are per-
formed at ambient boundaries, after which processes are free to execute until they need
to cross another boundary. This is a capability-based model of security, as opposed to a
cryptography-based model, or an access-control based model.
These three models are all interdefinable. In our case, access control is obtained by
using ambients to implement RPC-like invocations that have to cross boundaries and
authenticate every time. The cryptographic model is obtained by interpreting encryp-
tion keys as ambient names, which are by assumption unforgeable. Then, encryption is
given by wrapping some content in an ambient of a given name, and decryption is ob-
tained by either entering or opening such an ambient given an appropriate capability
(the decryption key).
Summary
We believe we have sufficiently illustrated how the ambient semantics naturally induc-
es unusual programming constructs that are well-suited for wide-area computation. The
combination of mobility, security, communication, and dynamic binding issues has not
been widely explored yet at the language-design level, and certainly not within a unify-
ing semantic paradigm. We hope our unifying foundation will facilitate the design of
such new languages.
5.3 Example: Public Transportation
We now show an example of a program written in ambient notation. Some additional
constructs used here have been introduced in the previous section.
This example emphasizes the mobility aspects of ambients, and the fact that an am-
bient may be transported from one place to another without having to know the exact
route to be followed. A passenger on a train, for example, only needs to know the des-
tination station, and need not be concerned with the intermediate stations a train may or
may not stop at.
In this example, there are three train stations, represented by ambients: stationA,
stationB and stationC (of course, these particular ambients will never move). There are
three trains, also represented by ambients: a train from stationA to stationB originating
at stationA, and two trains between stationB and stationC, one originating at each end.
There are two passengers, again represented by ambients: joe and nancy. Joe wants to go from stationA to stationC by changing trains at stationB; nancy wants to go the other way.
(Figure: the passengers and the trains trainAB and trainBC at the three stations.)
We begin by defining a parametric process that can be instantiated to trains going
between different stations at different speeds. The parameters are: stationX, the origin station; stationY, the destination station; XYatX, the tag that the train between X and Y displays when stationed at X; XYatY, the tag that the train between X and Y displays when stationed at Y; tripTime, the time a train takes to travel between origin and destination.
let train(stationX stationY XYatX XYatY tripTime) =
  (ν moving)  // assumes the train originates inside stationX
  moving[rec T.
    be XYatX. wait 2.0.
    be moving. out stationX. wait tripTime. in stationY.
    be XYatY. wait 2.0.
    be moving. out stationY. wait tripTime. in stationX.
    T]
The definition of a train begins with the creation of a new name, moving, that is in-
termittently used as the name of the train. While the train is moving, passengers should
not be allowed to (dis)embark; this is achieved by keeping moving a secret name. The
train begins as an ambient with name moving, and contains a single recursive thread that
shuttles the train back and forth between two stations. Initially, the train declares itself
to be a train between X and Y stationed at X, and waits some time for passengers to enter
and exit. Then it becomes moving, so passengers can no longer (dis)embark. It exits the
origin station, travels for the tripTime, enters the destination station, and declares itself
to be the train between X and Y at Y. Again, passengers can (dis)embark during the wait
time at the station. Then the train becomes moving again, goes back the other way, and
then repeats the whole process.
Next we have the configuration of stations and trains. We create fresh names for
the three stations and for the train tags. Then we construct three ambients for the three
stations, each containing an appropriately instantiated train.
(ν stationA stationB stationC ABatA ABatB BCatB BCatC)
stationA[train(stationA stationB ABatA ABatB 10.0)] |
stationB[train(stationB stationC BCatB BCatC 20.0)] |
stationC[train(stationC stationB BCatC BCatB 30.0)] |
Finally, we have the code for the passengers. Joe's itinerary is to enter stationA, wait to enter the train from A to B when it is stationed at A, exit at B, wait for the train from B to C when it is stationed at B, exit at C, and finally exit stationC. During the time that joe is waiting to exit a train, he is blocked waiting for the train to acquire the appropriate tag. The train could change tags at intermediate stations, but this would not affect joe, who is waiting to exit at a particular station. When that station is reached, and the train assumes the right tag, joe will attempt to exit. However, there is no guarantee that he will succeed. For example, joe may have fallen asleep, or there may be such a rush that joe does not manage to exit the train in time. In that case, joe keeps shuttling back and forth between two stations until he is able to exit at the right station.
(ν joe)
joe[
  in stationA.
  in ABatA. out ABatB.
  in BCatB. out BCatC.
  out stationC] |
The code for nancy is similar, except that she goes in the other direction. Given the
timing of the trains, it is very likely that nancy will meet joe on the platform at stationB.
(ν nancy)
nancy[
  in stationC.
  in BCatC. out BCatB.
  in ABatB. out ABatA.
  out stationA]
In all this, joe and nancy are active ambients that are being transported by other
ambients. Sometimes they move of their own initiative, while at other times they move
because their context moves. Note that there are two trains between stationB and stationC,
which assume the same names when stopped at a station. Joe and nancy do not care
which of these two trains they travel on; all they need to know is the correct train tag for
their itinerary, not the "serial number" of the train that carries them. Therefore, having
multiple ambients with the same name simplifies matters.
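A rough real-time simulation of this behaviour can be written in any threaded language. The following is a minimal Python sketch of ours, with threads standing in for ambients and illustrative timings; keeping the train's tag at `None` while it travels plays the role of the secret name moving:

```python
import threading
import time

class Train:
    """Simplified model of the ambient-calculus train: it alternates
    between a public station tag (passengers may board or exit) and a
    private 'moving' state (modelled by tag = None, so nobody moves)."""
    def __init__(self, tag_at_x, tag_at_y, trip_time):
        self.tags = [tag_at_x, tag_at_y]
        self.trip_time = trip_time
        self.tag = None                       # None models the secret 'moving'
        self.cond = threading.Condition()

    def run(self, cycles):
        for _ in range(cycles):
            for tag in self.tags:
                with self.cond:
                    self.tag = tag            # 'be XYatX': doors open
                    self.cond.notify_all()
                time.sleep(0.05)              # 'wait 2.0' in the calculus
                with self.cond:
                    self.tag = None           # 'be moving': doors closed
                time.sleep(self.trip_time)

    def wait_for(self, tag):
        """Block until the train shows the given tag (a passenger boarding
        or exiting): the predicate is re-checked on every notification."""
        with self.cond:
            self.cond.wait_for(lambda: self.tag == tag)

trace = []
train_ab = Train("ABatA", "ABatB", 0.05)

def joe():
    train_ab.wait_for("ABatA"); trace.append("joe: in ABatA")
    train_ab.wait_for("ABatB"); trace.append("joe: out ABatB")

t = threading.Thread(target=train_ab.run, args=(3,))
p = threading.Thread(target=joe)
t.start(); p.start()
t.join(); p.join()
print(trace)   # ['joe: in ABatA', 'joe: out ABatB']
```

Because the predicate of `wait_for` is evaluated before blocking, a passenger who arrives while the tag is already shown boards immediately, just as an ambient whose `in` capability is already enabled.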
When all these definitions are put together and activated, we obtain a real-time sim-
ulation of the system of stations, trains, and passengers. A partial trace looks like this:
nancy: moved in stationC
nancy: moved in BCatC
joe: moved in stationA
joe: moved in ABatA
joe: moved out ABatB
nancy: moved out BCatB
joe: moved in BCatB
nancy: moved in ABatB
Review
Each committee member is a reviewer, and may decide to review the paper directly, or
to send it to another reviewer. The review form keeps track of the chain of reviewers
so that it can find its way back when either completed or refused, and so that each
reviewer can check the work of the subreviewers. Eventually a review form is filled in. The form
performs various consistency checks, such as verifying that the assigned scores are in
range and that no required fields are left blank. Then it finds its way back to the program
chair.
Report generation
Once the review forms reach the program chair, they become report forms. The vari-
ous report forms for each paper merge with each other incrementally to form a single
report form that accumulates the scores and the reviews. The program chair monitors
the report form for each paper. If the reviews are in agreement, the program chair
declares the form an accepted paper report form or a rejected paper report form.
Conflict resolution
If the reports are in disagreement, the program chair declares the form an unresolved
review form. An unresolved review form circulates between the reviewers and the pro-
gram chair, accumulating further comments, until the program chair declares the paper
accepted or rejected.
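The incremental merging and the chair's decision can be sketched as follows; the field names and the agreement threshold are invented for illustration, since the text does not fix them:

```python
def merge(report_a, report_b):
    """Two report forms for the same paper merge into a single form
    that accumulates the scores and the review texts."""
    assert report_a["paper"] == report_b["paper"]
    return {"paper": report_a["paper"],
            "scores": report_a["scores"] + report_b["scores"],
            "reviews": report_a["reviews"] + report_b["reviews"]}

def classify(report, spread=2):
    """The program chair declares the merged form accepted, rejected,
    or unresolved, depending on whether the scores agree.
    The spread threshold and the acceptance cutoff are illustrative."""
    scores = report["scores"]
    if max(scores) - min(scores) > spread:
        return "unresolved"                 # circulates for more comments
    return "accepted" if sum(scores) / len(scores) >= 3 else "rejected"

r = merge({"paper": 7, "scores": [4], "reviews": ["solid"]},
          {"paper": 7, "scores": [5], "reviews": ["strong"]})
print(classify(r))   # accepted
```

An unresolved form would keep circulating and re-merging until `classify` yields a verdict.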
Notification
The report form for an accepted or rejected paper finds its way back to the author (minus
the confidential comments), with appropriate congratulations or regrets.
Final versions
Once it reaches the author, an accepted paper report form spawns a final submission
form. In due time, the author attaches to it the final version of the paper and signs the
copyright release notice. The completed final submission form finds its way back to
the program chair.
Proceedings
The final submission forms, upon reaching the program chair, merge themselves into
the proceedings. The program chair checks that all the final versions have arrived, sorts
them into a conference schedule, attaches a preface, and lets the proceedings find their
way to the conference chair.
Publication
The conference chair files the copyright release forms, signs the proceedings, and posts
them to public sites.
Comments
A few critical features characterize this application as particularly well-suited for exper-
imenting with wide-area languages.
First, it is a requirement of this application that most interactions happen in the absence
of connectivity. Virtual committee meetings occur over the span of one month of re-
viewing and one or two weeks of discussion. It is highly unlikely that all the committee
members will be continuously near their main workstation, or any workstation, during
such a span of time. Yet, progress cannot be interrupted by the temporary absence of
any one member. Furthermore, progress cannot be interrupted by the absence of
connectivity for any one member: paper reviews are commonly done on airplanes, in
doctors' waiting rooms, in lines at the Post Office, in cafés, etc. While a laptop or personal
organizer can be easily carried in those environments, continuous connectivity is far
from easy to achieve. This is to be contrasted with current web-based review systems,
which require reviewers to sit at a connected workstation while filling the review forms.
Second, form-filling requires semantic checking, which is best done while the form
is being filled. Therefore, active forms are required even during off-line operation. This
is to be contrasted with the filling of on-line Web-based review forms which require, in
practice, preparing reviews off-line on paper or in ASCII, and later typing them or past-
ing them laboriously into on-line forms in order to obtain the semantic checking. Alter-
natively, if the review is simply e-mailed in ASCII, then the program chair has the
considerable burden of performing the parsing and semantic checking.
Third, unattended operation is also highly desirable for the program chair. The
program chair may go on vacation after the assignment phase and come back to find all
the report forms already merged, thanks to the use of active forms.
Fourth, the system must handle multiple administrative domains. Committee mem-
bers are intentionally selected to belong to widely diverse and dispersed institutions,
many of which are protected by firewalls. In this respect, this situation is different from
classical office workflow on a local area network, although it shares many fundamental
features with it.
Fifth, all the forms are active. This relieves various principals from the tedious and
error-prone task of collecting, checking, and collating pieces of information, and dis-
tributing them to the correct sets of other principals.
In summary, in this example, interactions between various parts of the system hap-
pen over a wide-area network. The people involved may be physically moving during
or between interactions. As they move, they may transport without warning active parts
of the system. At other times, active parts of the system move by their own initiative
and must find a route to the appropriate principals wherever they are.
6 Conclusions
The global computational infrastructure has evolved in fundamental ways beyond
standard notions of sequential, concurrent, and distributed computational models. Mobile
ambients capture the structure and properties of wide-area networks, of mobile
computing, and of mobile computation. The ambient calculus [13] formalizes these notions
simply and powerfully. It supports reasoning about mobility and security, and has an
intuitive presentation in terms of the folder calculus. On this basis, we can envision new
programming methodologies, libraries and languages for global computation.
7 Acknowledgments
The ideas presented in this paper originated in the heated atmosphere of Silicon Valley
during the Web explosion, and were annealed by the cool and reflective environment of
Cambridge UK. I am deeply indebted to people in both locations, particularly to An-
drew Gordon, who is also a coauthor of related papers. In addition, Martin Abadi and
Cedric Fournet made comments and suggestions on recent drafts.
References
[1] Abadi, M., Fournet, C., Gonthier, G.: Secure Implementation of Channel Abstractions.
Proc. of the Thirteenth Annual IEEE Symposium on Logic in Computer Science (1998) 105-116.
[2] Abadi, M., Gordon, A.D.: A Calculus for Cryptographic Protocols: the Spi Calculus. Proc.
of the Fourth ACM Conference on Computer and Communications Security (1997) 36-47.
[3] Agha, G.A.: Actors: A Model of Concurrent Computing in Distributed Systems. MIT Press
(1986).
[4] Amadio, R.M.: An Asynchronous Model of Locality, Failure, and Process Mobility. Proc.
COORDINATION '97, Lecture Notes in Computer Science 1282, Springer Verlag (1997).
[5] Bharat, K., Cardelli, L.: Migratory Applications. Proc. of the ACM Symposium on User
Interface Software and Technology '95 (1995) 133-142.
[6] Berry, G.: The Foundations of Esterel. In: Plotkin, G., Stirling, C., Tofte, M. (eds.): Proof,
Language, and Interaction: Essays in Honour of Robin Milner. MIT Press (1998).
[7] Berry, G., Boudol, G.: The Chemical Abstract Machine. Theoretical Computer Science
96(1) (1992) 217-248.
[8] Boudol, G.: Asynchrony and the π-calculus. Technical Report 1702, INRIA,
Sophia-Antipolis (1992).
[9] Bracha, G., Toueg, S.: Asynchronous Consensus and Broadcast Protocols. J. ACM 32(4)
(1985) 824-840.
[10] Brauer, W. (ed.): Net Theory and Applications. Proc. of the Advanced Course on General
Net Theory of Processes and Systems, Hamburg, 1979. Lecture Notes in Computer Science
84, Springer-Verlag (1980).
[11] Cardelli, L.: A Language with Distributed Scope. Computing Systems 8(1), MIT Press
(1995) 27-59.
[12] Cardelli, L., Davies, R.: Service Combinators for Web Computing. Proc. of the First Usenix
Conference on Domain Specific Languages, Santa Barbara (1997).
[13] Cardelli, L., Gordon, A.D.: Mobile Ambients. In: Nivat, M. (ed.): Foundations of Software
Science and Computational Structures, Lecture Notes in Computer Science 1378, Springer
(1998) 140-155.
[14] Cardelli, L., Gordon, A.D.: Types for Mobile Ambients. Proc. of the 26th ACM Symposium
on Principles of Programming Languages (1999) 79-92.
[15] Carriero, N., Gelernter, D.: Linda in Context. Communications of the ACM 32(4) (1989)
444-458.
[16] Carriero, N., Gelernter, D., Zuck, L.: Bauhaus Linda. In: Ciancarini, P., Nierstrasz, O.,
Yonezawa, A. (eds.): Object-Based Models and Languages for Concurrent Systems. Lecture
Notes in Computer Science 924, Springer Verlag (1995) 66-76.
[17] Chandra, T.D., Toueg, S.: Unreliable Failure Detectors for Asynchronous Systems. ACM
Symposium on Principles of Distributed Computing (1991) 325-340.
[18] De Nicola, R., Ferrari, G.-L., Pugliese, R.: Locality Based Linda: Programming with
Explicit Localities. Proc. TAPSOFT '97, Lecture Notes in Computer Science 1214, Springer
Verlag (1997) 712-726.
[19] Fischer, M.J., Lynch, N.A., Paterson, M.S.: Impossibility of Distributed Consensus with
One Faulty Process. J. ACM 32(2) (1985) 374-382.
[20] Fournet, C., Gonthier, G.: The Reflexive CHAM and the Join-Calculus. Proc. 23rd Annual
ACM Symposium on Principles of Programming Languages (1996) 372-385.
[21] Fournet, C., Gonthier, G., Lévy, J.-J., Maranget, L., Rémy, D.: A Calculus of Mobile Agents.
Proc. 7th International Conference on Concurrency Theory (CONCUR'96) (1996) 406-421.
[22] Fournet, C., Maranget, L.: The Join-Calculus Language - Documentation and User's Guide.
<http://pauillac.inria.fr/join/> (1997).
[23] Gosling, J., Joy, B., Steele, G.: The Java Language Specification. Addison-Wesley (1996).
[24] Hoare, C.A.R.: Communicating Sequential Processes. Communications of the ACM 21(8)
(1978) 666-678.
[25] Honda, K., Tokoro, M.: An Object Calculus for Asynchronous Communication. Proc.
ECOOP'91, Lecture Notes in Computer Science 521, Springer Verlag (1991) 133-147.
[26] INMOS Ltd.: occam Programming Manual. Prentice Hall (1984).
[27] Kistler, T., Marais, J.: WebL - A Programming Language for the Web. Computer Networks
and ISDN Systems 30, Elsevier (1998) 259-270.
[28] Milner, R.: A Calculus of Communicating Systems. Lecture Notes in Computer Science 92,
Springer Verlag (1980).
[29] Milner, R.: Functions as Processes. Mathematical Structures in Computer Science 2 (1992)
119-141.
[30] Milner, R., Parrow, J., Walker, D.: A Calculus of Mobile Processes, Parts 1-2. Information
and Computation 100(1) (1992) 1-77.
[31] Morris, J.H.: Lambda-Calculus Models of Programming Languages. Ph.D. Thesis, MIT
(Dec 1968).
[32] Palamidessi, C.: Comparing the Expressive Power of the Synchronous and the
Asynchronous Pi-calculus. Proc. 24th ACM Symposium on Principles of Programming
Languages (1997) 256-265.
[33] Sander, T., Tschudin, C.F.: Towards Mobile Cryptography. ICSI Technical Report 97-049,
November 1997. Proc. IEEE Symposium on Security and Privacy (1998).
[34] Sangiorgi, D.: From π-calculus to Higher-Order π-calculus - and Back. Proc. TAPSOFT
'93, Lecture Notes in Computer Science 668, Springer Verlag (1993).
[35] Stamos, J.W., Gifford, D.K.: Remote Evaluation. ACM Transactions on Programming
Languages and Systems 12(4) (1990) 537-565.
[36] Stein, L.A., Lieberman, H., Ungar, D.: A Shared View of Sharing: The Treaty of Orlando.
In: Kim, W., Lochovsky, F. (eds.): Object-Oriented Concepts, Applications, and Databases.
Addison-Wesley (1988) 31-48.
[37] White, J.E.: Mobile Agents. In: Bradshaw, J. (ed.): Software Agents. AAAI Press / The MIT
Press (1996).
Type-Safe Execution of Mobile Agents
in Anonymous Networks
1 Introduction
In [6] we presented a type system for controlling the use of resources in a distributed
system. The type system guarantees that resource access is always safe, in the sense that,
for example, integer channels are always used with integers and boolean channels are
always used with booleans. The type system of [6], however, requires that all agents in
the system be well-typed. In open systems, such as the internet, such global properties
are impossible to verify. In this paper, we present a type system for partially typed
networks, where only a subset of agents are assumed to be well typed.
This notion of partial typing is presented using the language Dπ, from [6]. In Dπ
mobile agents are modeled as threads, using a thread language based on the π-calculus.
Threads are located, carrying out computations at locations or sites. Located threads,
or agents, interact using local channels, or resources.
In an open system, not all sites are necessarily benign. Some sites may harbor mali-
cious agents that do not respect the typing rules laid down for the use of resources. For
example, consider the system
  k[(ν c:chan(int)) go m. a!(k[c])]
  | m[a?(z[x]) go z. x!(t)]
consisting of two sites k and m. The first generates a new local channel c for transmitting
integers and makes it known to the second site m, by sending it along the channel a local
to m. Here, the value k[c] and the pattern z[x] are dependent pairs, where the first element
represents a location and the second element represents a resource at that location. In a
benign world k could assume that any mobile agent that subsequently migrates from m
to k would only use this new channel c to transmit integers at k. However in an insecure
world m may not play according to the rules; in our example it sends an agent to k which
misuses the new resource by sending the boolean value t along it.
In this paper we formalize one strategy that sites can use to protect themselves from
such attacks. The strategy makes no assumptions about the security of the underlying
network. For example, it is not assumed that the source of a message (or agent) can be
reliably determined. We refer to such networks as anonymous networks.
2 The Language
We present the syntax and standard semantics of Dπ. For a full treatment of the
language, including many examples, see [6]. The language is a simplification and
refinement of that introduced in [11].
2.1 Syntax
The syntax is given in Table 1. We defer the discussion of types, T, to Section 2.3. The
syntax is parameterized with respect to the following syntactic sets, which we assume
to be disjoint:
- Var, of variables, ranged over by x-z,
- Name, of names, ranged over by a-m,
- Int, of integers, ranged over by i, and
- Bool = {t, f}, of booleans, ranged over by bv.
We use u-w to range over the set of identifiers, Id = Var ∪ Name. We typically use the
names a-d to refer to channels and k-m to refer to locations, although the distinction is
formally imposed by the type system. We use e to refer to names that might be of either
type. The main syntactic categories of the language are as follows:
- Threads, P-R, are (almost) terms of the ordinary polyadic π-calculus [8]. The thread
language includes the static combinators for composition '|' and typed restriction
Table 1  Syntax
Id:  u, v, w ::= e | x
Val: U, V    ::= i | bv | u | w[u1, .., un] | (U1, .., Un)
Pat: X, Y    ::= x | z[x1, .., xn] | (X1, .., Xn)
'(ν e:T)', as well as constructs for movement 'go u.P', output 'u!(V) P', typed input
'u?(X:T) Q', (mis)matching 'if U = V then P else Q' and iteration '*P'.
- Agents, ℓ[P], are located threads.
- Systems, M-N, are collections of agents combined using the static combinators '|'
and '(ν_ℓ e:T)'.
The output and input constructs make use of syntactic categories for values, U-
V, and patterns, X-Y, respectively. Values include variables, names, base values, and
tuples of these. A value of the form w[u1, .., un] includes a location w and a collection
of channels u1, .., un allocated at w. Patterns, X-Y, provide destructors for each type.
To be well-formed, we require that patterns be linear, i.e. each variable appears at most
once.
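The linearity condition on patterns is easy to check mechanically. A small illustrative sketch, using an encoding of ours in which patterns are nested tuples and strings stand for variables:

```python
def linear(pattern):
    """Check that each variable occurs at most once in a pattern.
    Patterns are modelled as nested tuples; strings are variables."""
    seen = set()
    def walk(p):
        if isinstance(p, str):          # a variable occurrence
            if p in seen:
                return False            # second occurrence: not linear
            seen.add(p)
            return True
        return all(walk(q) for q in p)  # a tuple or z[...] destructor
    return walk(pattern)

print(linear(("x", ("y", "z"))))   # True
print(linear(("x", ("x", "z"))))   # False: x appears twice
```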
As an example of a system, consider the term:
  ℓ[P] | (ν_ℓ a:A)(ℓ[Q] | k[R])
This system contains three agents, ℓ[P], ℓ[Q] and k[R]. The first two agents are running
at location ℓ, the third at location k. Moreover Q and R share a private channel a of type
A, allocated at ℓ and unknown to P.
Unlike [3, 5], agents are relatively lightweight in Dπ. They are single-threaded and
can be freely split and merged using structural rules and communication. As such, they
are unnamed.
- In the concrete syntax, "move" has greater binding power than composition. Thus
go ℓ.P | Q should be read (go ℓ.P) | Q. We adopt several standard abbreviations.
For example, we routinely drop type annotations when they are not of interest.
We omit trailing occurrences of stop and often denote tuples and other groups using
a tilde. For example, we write ã instead of (a1, .., an) and (ν ẽ:T̃)P instead of
(ν e1:T1)..(ν en:Tn)P. We also write 'if U = V then P' instead of 'if U = V then
P else stop' and 'if U ≠ V then Q' instead of 'if U = V then stop else Q'.
- We assume the standard notion of free and bound occurrences of variables and
names in systems and threads. The variables in the pattern X are bound by the input
construct u?(X) Q, the scope is Q. The name e is bound by the restrictions (ν_ℓ e:T) N
and (ν e:T) P, the scopes are N and P, respectively. A term with no free variables is
closed. The functions fn(N) and fid(N) return respectively the sets of free names
and free identifiers occurring in N.
- We also assume a standard notion of substitution, where N{u/x} denotes the
capture-avoiding substitution of u for x in N. The notation N{|U/X|} generalizes this
in an obvious way as a sequence of substitutions. For example, N{|k[c]/z[x]|} = N{k/z}{c/x}.
- In the sequel we identify terms up to renaming of bound identifiers. □
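Capture-avoiding substitution can be sketched on a toy fragment of the thread language. The encoding below is ours and purely illustrative: terms are tagged tuples, and a restricted name is alpha-renamed whenever the incoming name would be captured:

```python
import itertools

fresh_names = (f"fresh{i}" for i in itertools.count())

# Toy thread fragment: ("stop",), ("out", chan, val, cont), ("new", name, cont)
def subst(term, u, x):
    """Capture-avoiding substitution term{u/x}: replace free occurrences
    of x by the name u, renaming restricted names that would capture u."""
    tag = term[0]
    if tag == "stop":
        return term
    if tag == "out":
        rep = lambda n: u if n == x else n
        return ("out", rep(term[1]), rep(term[2]), subst(term[3], u, x))
    e, body = term[1], term[2]          # tag == "new": (ν e) body
    if e == x:                          # x is bound here: nothing free below
        return term
    if e == u:                          # u would be captured: rename e first
        e2 = next(fresh_names)
        body = subst(body, e2, e)
        e = e2
    return ("new", e, subst(body, u, x))

t = ("new", "a", ("out", "b", "x", ("stop",)))
r = subst(t, "a", "x")    # the bound 'a' is renamed before 'a' is inserted
print(r)
```

Without the renaming step, substituting the name a for x under the binder (ν a) would wrongly identify the incoming a with the locally restricted one.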
  ℓ[go k.P] ⟼ k[P]
The monoid laws are: N | 0 = N, N | M = M | N, and N | (M | O) = (N | M) | O.
which states that an agent located at ℓ can move to k using the move operator go k.P.
This is the only rule that varies significantly between the standard semantics and the
semantics for open systems defined later. Note that in (r-comm), communication is
purely local:

  ℓ[a!(V) P] | ℓ[a?(X) Q] ⟼ ℓ[P] | ℓ[Q{|V/X|}]
As an example, suppose that we wish to write a system of two agents, one at k and
one at ℓ. The agent at k wishes to send a fresh integer channel a, located at k, to the
other agent using the channel b, located at ℓ. This system could be written:
Location types are essentially the same as standard record types, and we identify
location types up to reordering of their "fields". Thus loc{a:A, b:B}[C] = loc{b:B,
a:A}[C]. But reordering is not allowed on "abstract" fields. Thus if B and C are
different, then loc{a:A}[B,C] ≠ loc{a:A}[C,B].
Throughout the text, we drop empty braces when clear from context, writing 'loc'
instead of 'loc{}[]', 'K' instead of 'K[]', and 'w' instead of 'w[]'.
The subtyping relation (T < S) is discussed at length in [6]. On base types and
channel types there is no nontrivial subtyping; for example, chan(T) < chan(T') if and
only if T = T'. On location types (both simple and "existential"), the subtyping relation
is similar to that traditionally defined for record or object types:
  { k : loc{a:chan(int), x:int}
    z : loc{ a:chan(loc[chan(int)]),
             y:chan(loc[chan(bool)]) } }

Here we have two locations, k and z. The first has an integer channel named a and an
integer variable x. The second has two channels: a, which communicates (potentially
remote) integer channels, and y, which communicates (potentially remote) boolean
channels.
Therefore the extension of L by V:T adds only information about local identifiers to L.
For example:

  loc{d:D}, (0, x, z[y]):(int, A, loc{c:C}[B]) = loc{d:D, x:A}
Note that every location type L is also a v-open location type, and thus this definition
applies to "closed" location types as well. In the same way the definition applies to
patterns, as well as values, since syntactically every pattern is also a value.
We use a similar notation for extending type environments: if u is not in the domain
of Γ then Γ, u:L denotes the new type environment, which is similar to Γ but in addition
maps u to type L.
where Δ is a closed type environment and N and N' are systems.
The reduction semantics is given in Table 3. Most of the rules are simple adaptations
of the corresponding rules in Table 2. For example, the rules for local communication
and matching of values are essentially as before, as Δ is not consulted for these
reductions. There is a minor change in the rule for the restriction operator, because Δ must
be augmented to reflect the addition of the new name.
The only significant change from the standard run-time semantics is in the rule for
code movement:

  Δ ⊳ ℓ[go k. P] ⟼ k[P]   if Δ(k) ⊢ P

This says that the agent P can move from location ℓ to location k only if P is guaranteed
not to misuse the local resources of k, i.e. Δ(k) ⊢ P. Here P is type-checked dynamically
against Δ(k), which gives the names and types of the resources available at k.
The definition of this runtime local type-checking is given in Table 4. This is a
lightweight typing in that the incoming code is only checked to the extent of its
references to local resources. Thus judgments are of the form

  L ⊢ P

indicating that P can safely run at a location that provides resources as defined in L.
Perhaps the most surprising rule in this lightweight type checking is (t-move),
which involves no type checking whatsoever. However this is reasonable as an agent
such as go ℓ.P running at k uses no local resources; it moves immediately to the site ℓ.
As a result of this rule, reductions of the form

  Δ ⊳ ℓ[go k. go k'. P] ⟼ k[go k'. P]

are always allowed: the incoming agent uses no resources at k.

Table 4  Runtime local type checking

                                    L ⊢ P                     L, a:A ⊢ P
  (t-move) -------------  (t-newl) ---------------  (t-newc) --------------
           L ⊢ go u.P              L ⊢ (ν k:K) P             L ⊢ (ν a:A) P

         L ⊢ u:chan(T)   L, X:T ⊢ Q           L ⊢ u:chan(T), V:T, P
  (t-r)  --------------------------    (t-w)  ----------------------
         L ⊢ u?(X:T) Q                        L ⊢ u!(V) P

          L ⊢ P, Q                      L ⊢ U:T, V:T, P, Q
  (t-str) ------------------   (t-eq)  ---------------------------
          L ⊢ stop, *P, P|Q            L ⊢ if U = V then P else Q
Once more there is a subtlety, this time in the local type checking of values. If the value
V to be transmitted is a local resource, say a channel name b, then according to the rule
(t-sit) b must have the local type T. If, on the other hand, V is a non-local value, say
k[b], then locally this is of no interest; according to (t-loc) k[b] can be assigned any
location type, which in effect means that when it is transmitted locally on a its validity
is not checked.
This ends our discussion of runtime local type checking, and of the runtime seman-
tics.
3.2 An Example
As an example consider a system of three locations, k, ℓ and m, with the following
distributed type environment, Δ:

  { k : loc{a:chan(int)}
    ℓ : loc{b:chan(loc[chan(bool)])}
    m : loc{d:chan(loc[chan(bool)])} }

  k[go m. d!(k[a])]
  | m[d?(z[x]) go ℓ. b!(z[x])]
  | ℓ[b?(z[x]) go z. x!(t)]
Here k communicates the name of its integer channel a to m, using the channel d local
to m. Then m misinforms ℓ about the type of a at k: the communication along b fools ℓ
into believing that a is a boolean channel. Subsequently ℓ attempts to send an agent to
k that violates the type of the local resource a, by sending a boolean value where an
integer is expected.
The reader can check that according to our runtime semantics the first code
movement between k and m is allowed:

  Δ ⊳ k[go m. d!(k[a])] ⟼ m[d!(k[a])]

as local type checking of the migrating agent succeeds, Δ(m) ⊢ d!(k[a]). The local
channel d is used correctly and since the value transmitted, k[a], is non-local it is
essentially not examined (only the number of names is checked, not their types).
The local communication at m on channel d now occurs and the second code
movement between m and ℓ is also allowed, because the migrating thread, b!(k[a]), is
also successful in its type check against the local resources, Δ(ℓ). The local
communication along b now occurs. However the next potential move, the migration of the
thread a!(t) from ℓ to k, is disallowed by the rule (r-move); the thread is locally type
checked against the resources at k, where a is known to be an integer channel, and its
potential misuse is discovered.
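The dynamic checks used in this example can be sketched as follows. The encoding of threads and types is ours and purely illustrative; Table 4 is the authoritative definition:

```python
def value_ok(val, ty):
    """(t-loc)-style leniency: non-local values are not inspected;
    base values must match the channel's declared object type."""
    kind = val[0]
    return kind == "remote" or kind == ty

def check(thread, local):
    """Admit an incoming thread iff its uses of the channels listed in
    `local` respect their declared types (lightweight local typing)."""
    tag = thread[0]
    if tag == "stop":
        return True
    if tag == "go":                    # (t-move): the agent leaves at once,
        return True                    # so nothing local is checked
    if tag == "out":                   # a!(V) P
        _, chan, val, cont = thread
        if chan in local and not value_ok(val, local[chan]):
            return False
        return check(cont, local)
    if tag == "in":                    # a?(X:T) Q: check the continuation
        return check(thread[3], local)
    raise ValueError(f"unknown thread form: {tag}")

local = {"a": "int"}                   # k : loc{a:chan(int)}
print(check(("out", "a", ("int", 5), ("stop",)), local))                   # True
print(check(("out", "a", ("bool", True), ("stop",)), local))               # False
print(check(("go", "m", ("out", "a", ("bool", True), ("stop",))), local))  # True
```

The second call mirrors the rejected migration of a!(t) to k, while the third mirrors rule (t-move): a thread that immediately migrates away is admitted without inspection.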
4 Static typing
In the runtime semantics misuse of local resources can certainly occur, since there is no
requirement that values have the object type specified by the transmitting channel. For
example the reduction

  Δ ⊳ k[a!(t) P] | k[a?(x:bool) Q] ⟼ k[P] | k[Q{t/x}]

is allowed even if the object type of the channel a at k is int, i.e. Δ(k) < loc{a:chan(int)}.
We do not assume that all sites respect the typing constraints on their channels. For
convenience let us call sites that violate the typing constraints bad sites, as they do not
play according to the rules. A typical example is the site m, described at the end of
the previous section; it receives an integer channel a from k but then attempts to use
a to send a boolean value. In contrast a good site is one where typing constraints are
enforced.
In this section we present a static type system that guarantees that:
good sites cannot be harmed by bad sites.
That is, local resources at a good site cannot be misused despite the existence of, and in-
teraction with, bad sites. We prove Subject Reduction and Type Safety theorems for the
type system. Intuitively, Subject Reduction can be interpreted as saying that the integrity
of good sites is maintained as computation proceeds, while Type Safety demonstrates
that local resources at good sites cannot be misused.
The static typing relation for anonymous networks is defined in Table 5. Judgments
are of the form

  Γ ⊢ N

where Γ is a (v-open) type environment and N a system. The type environment only
records the types of good locations; thus k ∈ dom(Γ) is to be read "k is good" and
m ∉ dom(Γ) may be read as "m is bad." If Γ ⊢ N, then those agents in N that are located
at sites in the domain of Γ are guaranteed to be "well behaved".
For threads and values, the static typing relation is the same as the runtime typing
relation given in Table 4. The typing of a located agent ℓ[P] depends on whether ℓ is good.
  Γ = { k : loc{a:chan(int)}
        ℓ : loc{b:chan(loc[chan(bool)])} }          (*)
Then one can easily check that Γ ⊢ Δ ⊳ N, that is, Δ is consistent with Γ and Γ ⊢ N.
Intuitively here we are saying that k and ℓ are good sites which use their local resources
correctly, whereas no guarantee is made about the local behavior at m. Note, however,
that if one considers static typing under Δ, which includes m, then Δ does not type N.
The static typing system satisfies several important properties, such as type special-
ization and narrowing, which are stated and proved in Appendix A. These properties
are used to establish the following result, which states that well-typing at good sites is
preserved by reduction.
THEOREM 1 (SUBJECT REDUCTION).
If Γ ⊢ Δ ⊳ N and Δ ⊳ N ⟼ N' then Γ ⊢ Δ ⊳ N'.
Proof. See Appendix A. □
We now discuss the extent to which the type system precludes the misuse of local
resources at good sites. This can be formalized using a notion of runtime error. In
this paper, we confine our attention to runtime errors based on arity mismatching.
Intuitively, an arity mismatch occurs when the value sent on a channel does not match the
type that the recipient expects, or when two structurally dissimilar values are compared
using the match construct. To formalize this notion, we define a compatibility relation
U ⋈ T between (closed) values and types, and a compatibility relation U ⋈ V between
(closed) values.
The definitions of compatibility and of runtime error are given in Table 6. Runtime
error is defined as a predicate N −err→ ℓ, indicating that N is capable of a runtime error at
location ℓ. Essentially an error occurs at location ℓ if either two incompatible values are
compared at ℓ or an attempt is made to communicate a value along a local channel which
is incompatible with the type of the channel. The only non-trivial rule is (e-new), which
may effect a change in the name of the location where the error occurs. For example it
will report an error at ℓ in (ν_ℓ k:K) N if there is a runtime error in N at location k.
We should point out that this definition of run-time error is considerably weaker
than that employed in [6]; in that paper, the notion of run-time error took into account
not only arity mismatches but also access violations.
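As a rough illustration, an arity-based compatibility check between closed values and types might be sketched as follows. The encodings of values and types are our own, not the paper's:

```python
def compat(value, ty):
    """Arity-based compatibility between a closed value and a type:
    tuples must match a tuple type of the same length componentwise;
    base values must match their base type; for channels and locations
    only the shape is checked."""
    tag = ty[0]
    if tag == "tuple":
        return (value[0] == "tuple"
                and len(value[1]) == len(ty[1])            # arity must match
                and all(compat(v, t) for v, t in zip(value[1], ty[1])))
    if tag in ("int", "bool"):
        return value[0] == tag
    return True    # chan(...)/loc: shape only, per the weaker error notion

print(compat(("tuple", [("int", 3), ("bool", True)]),
             ("tuple", [("int",), ("bool",)])))            # True
print(compat(("tuple", [("int", 3)]),
             ("tuple", [("int",), ("bool",)])))            # False: arity mismatch
```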
THEOREM 2 (TYPE SAFETY). If Γ ⊢ N and ℓ ∈ dom(Γ) then it is not the case that N −err→ ℓ.
Proof. See Appendix A. □
This theorem, together with Subject Reduction, can be interpreted informally as saying
that as reductions proceed local resources cannot be locally misused at good sites, even
in systems where not all sites are necessarily well-behaved.
5 Discussion
Here we make three points which demonstrate the limitations of both the runtime strat-
egy and the typing system.
First the static typing is very weak as it is designed only to eliminate local misuse
of local resources. Let
  Γ = { ℓ : loc{ b:chan(loc{d:chan(bool)}),
                 c:chan(loc{d:chan(int)}) }
        k : loc{d:chan(int)} }
although one could argue that at location ℓ there is a misuse of the channel b. A runtime
error is avoided by the dynamic type-checking; after the communications on c and b the
potential move

  ℓ[go k. d!(t)] ⟼ k[d!(t)]

is blocked because the agent attempting to move to k, d!(t), does not type check against
the local resources at k.
Second, the requirement to dynamically type-check all incoming threads is very
inefficient; however, the weak typing system makes it essential. Purely local type-
checking makes it very difficult to introduce trust into the system. A site cannot even
trust itself! For example suppose we revised the reduction rule (r-move) to read as fol-
lows:
  Δ ⊳ ℓ[go k. P] ⟼ Δ ⊳ k[P]   if k = ℓ or Δ(k) ⊢ P
Here the site k trusts itself and therefore does not type check the thread P. However
this rule is not safe; Subject Reduction fails and runtime errors may be introduced. As
an example, consider the following configuration, which uses the typing environment Γ
given at the beginning of this section:
  Γ ⊳ ℓ[c?(z) go z. c!(t)] | ℓ[c!(ℓ)]

The configuration can be typed with respect to Γ itself, but after the communication
the result is Γ ⊳ ℓ[c!(t)], which fails to type under Γ. Moreover Γ ⊳ ℓ[c!(t)] induces
a runtime error due to the potential misuse of channel c. A related phenomenon is the
potential misuse of channel names as locations; e.g. the thread '(ν a:chan(int)) go a.P'
is typable in our system.
Third, the system relies heavily on dynamic type-checking to avoid misuse of re-
sources. This can be highlighted by considering another desirable property of a runtime
semantics for open systems:
Movement between good sites should always be allowed, even in the presence
of badly behaved sites.
It is obvious from the example discussed in Section 3.2 that our run-time semantics does
not satisfy this property. If we use the static environment Γ defined in (*) (Section 4),
then k and ℓ are good sites. But as we have seen in Section 3.2, the intervention of m
eventually prevents a movement from k to ℓ.
In a companion paper [12] we address these issues by strengthening the typing sys-
tem. The second concern raised here (the inability to trust oneself, or to ensure that
channels are not used as locations) can be addressed by reformulating the typing sys-
tem while retaining its basic "local" character; such a reformulation is given in the
appendix of [12].
In [12], we go further, however, extending the notion of trust to arbitrary collec-
tions of sites. Dynamic typechecking is strengthened by making a site record informa-
tion about all sites in the system, not only itself. While typechecking in this system is
computationally more expensive, not all incoming threads need be checked; those orig-
inating at trusted sites are allowed through unchecked. This new semantics addresses
the first two of the above concerns fully, and the third partially: a stronger notion of
type safety is guaranteed, some incoming threads are not typechecked, and movement
between mutually trusted sites is always allowed, although movement between good
sites may not be.
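The admission discipline described above can be sketched in a few lines (a simplification under our own naming, `admit` and `trusted`; the system of [12] actually records typing assumptions about all sites, not just a set of names): threads arriving from trusted origins pass unchecked, while all others undergo the dynamic check.

```python
# Sketch: a site keeps a set of trusted origins; incoming threads from
# trusted sites are admitted unchecked, the rest are dynamically typechecked.

def admit(origin, thread_uses, trusted, local_types):
    if origin in trusted:
        return True                          # trusted origin: no runtime check
    return all(local_types.get(ch) == ty     # untrusted: full dynamic check
               for ch, ty in thread_uses.items())

local = {"d": "chan(int)"}
print(admit("k", {"d": "chan(bool)"}, trusted={"k"}, local_types=local))  # -> True
print(admit("m", {"d": "chan(bool)"}, trusted={"k"}, local_types=local))  # -> False
```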
6 Related Work
In this paper we have outlined a strategy for ensuring that the integrity of well-behaved
sites is not compromised by the presence of potentially malicious mobile agents. More-
over we have formalized the correctness of this strategy in terms of Subject Reduction
and Type Safety theorems for a partial type system.
In this study we used the language Dπ [6], one of a number of distributed versions of
the π-calculus [8]. For other variations see [5, 13]. The languages in [3, 4] are themati-
cally similar although based on somewhat different principles. We have taken advantage
of a rich type system for Dπ, originally presented in [6], where not only do channels
have the types originally proposed in [10] for the π-calculus, but locations have types
broadly similar to those of objects. An even richer type system is also proposed in [6]
in which types correspond to capabilities, as in [4], and an interesting topic for future
research would be the extension of partial typing to these richer types.
Our research is related to proposals for proof-carrying code outlined in [9]: code
consumers, which in our case are locations, demand of code producers, in our case
incoming threads, that their code is accompanied by a proof of correctness. This proof
is checked by the consumer before the code is allowed to execute. The correctness
is expressed in terms of a public safety policy announced by the consumer and the
producer must provide code along with a proof that it satisfies this policy. In our case
this safety policy is determined by the location type which records the types of the
consumer's resources, and proof checking corresponds to type checking the incoming
code against this record. Our work is different in that the correctness proof can be
reconstructed efficiently, and therefore the producer need not supply an explicit proof.
For other examples of related work within this framework see [7, 14]. For example
the former contains a number of schemes for typechecking incoming code for access
violations to local private resources. However the language is very different from ours,
namely a sequential higher-order functional language, and there is no direct formaliza-
tion of the fact that distributed systems which employ these schemes are well-behaved.
A very different approach to system security is based on the use of cryptography
and signatures. For example, [1] presents a π-calculus based language which contains
cryptographic constructs that ensure the exchange of data between trusted agents,
while [2] contains a description of the application of this approach in a practical setting.
A Proofs
Proof. By induction on the judgment L ⊢ V:T. If V:T takes the form V:H then S must
coincide with H, since there is no non-trivial subtyping on channel types or base types.
If V:T has the form w⟦M⟧:L[A] then the result is trivial, using (t-loc). Finally, the case
for tuples follows by induction. □
PROPOSITION 4 (WEAKENING).
– If L ⊢ V:T and K < L then K ⊢ V:T
– If L ⊢ P and K < L then K ⊢ P
– If Γ, w:L ⊢ N and K < L then Γ, w:K ⊢ N
Proof. In each case the proof is by induction on the type inference. We examine two
examples of proof on threads:
(t-r). Here L ⊢ u?(X:T) Q because L ⊢ u:chan(T) and L, X:T ⊢ Q. We can apply the
first statement in the proposition to the former, to obtain K ⊢ u:chan(T), while
induction applied to the latter gives K, X:T ⊢ Q. An application of (t-r) now gives the
required K ⊢ u?(X:T) Q.
(t-newc). Here L ⊢ (νa:A)P because L, a:A ⊢ P. By α-conversion we can choose a
so that it does not appear in K and therefore by induction we have K, a:A ⊢ P. Now
an application of (t-newc) gives the required K ⊢ (νa:A)P.
We present four cases for the proof on systems.
(t-rung). Here Γ, w:L ⊢ m⟦P⟧ because M ⊢ P, where M = (Γ, w:L)(m). If m and w
are different then we also have M = (Γ, w:K)(m) and therefore an application of
(t-rung) gives the required Γ, w:K ⊢ m⟦P⟧. On the other hand, if m is the same as
w then M = L. So we can apply the second part of the proposition to M, obtaining
K ⊢ P. Now (t-rung) also gives the required Γ, w:K ⊢ m⟦P⟧.
(t-runb). This case is trivial.
(t-newlg). Here Γ, w:L ⊢ (ν_ℓ m:M)N because ℓ ∈ dom(Γ, w:L) and Γ, w:L, m:M ⊢ N.
Applying induction we obtain Γ, w:K, m:M ⊢ N. Now (t-newlg) can be applied,
since ℓ ∈ dom(Γ, w:K), to obtain the required Γ, w:K ⊢ (ν_ℓ m:M)N.
(t-newcb). Here Γ, w:L ⊢ (ν_ℓ a:A)N because ℓ ∉ dom(Γ, w:L) and Γ, w:L ⊢ N. How-
ever we also have ℓ ∉ dom(Γ, w:K) and therefore (t-newcb) can also be applied to
obtain the required Γ, w:K ⊢ (ν_ℓ a:A)N. □
The following Restriction Lemma states that if Γ ⊢ N and some identifier u does
not occur free in N then N can also be typed in an environment obtained from Γ by
removing all occurrences of u. For any identifier u let Γ\u denote the result of removing
all occurrences of u from Γ. For example, (Γ, u:L)\u denotes Γ, while (Γ, w:(L, u:A))\u
is the same as (Γ\u), w:L.
LEMMA 5 (RESTRICTION).
– If L, v:H ⊢ U:T and v ∉ fid(U) then L ⊢ U:T
– If L, v:H ⊢ P and v ∉ fid(P) then L ⊢ P
– If Γ ⊢ N and v ∉ fid(N) then Γ\v ⊢ N.
Proof. By induction on the proof of the typing judgment. □
The following corollary follows by an easy induction on V.
COROLLARY 6.
We first show that typing is preserved by the structural equivalence. The most compli-
cated case is already covered by the Restriction proposition.
Proof. There are two cases. If ℓ ∉ dom(Γ), then we can reason as follows:
In the case that ℓ ∈ dom(Γ), the argument is slightly different depending on whether
ℓ is a channel or a location. As an example we consider the former, and we assume Γ
has the form Δ, ℓ:L.
As is normally the case the proof of Subject Reduction depends on the fact that, in some
sense, typing is preserved by substitution. To prove this fact the following lemma will
be useful:
some S for which we also have Γ(ℓ) ⊢ M:chan(S). In our typing system this must
mean that S and T coincide. We may therefore apply the Substitution lemma to
obtain the required Γ(ℓ) ⊢ Q{V/X}.
(r-new). We consider the case:

    Δ, ℓ:L ⊳ (ν_ℓ a:A)N ↦ (ν_ℓ a:A)N′   because   Δ, ℓ:(L, a:A) ⊳ N ↦ N′

First suppose ℓ ∈ dom(Γ). Since Γ ⊢ Δ, ℓ:L ⊳ (ν_ℓ a:A)N we know Γ can be written
as Γ′, ℓ:L′, where L′ < L and therefore Γ′, ℓ:(L′, a:A) ⊢ N. We can now apply in-
duction to obtain Γ′, ℓ:(L′, a:A) ⊢ N′, to which (t-newcg) can be applied to obtain
the required Γ ⊢ (ν_ℓ a:A)N′.
If ℓ ∉ dom(Γ) then by (t-newcb) it is sufficient to prove Γ ⊢ N′. In this case Γ ⊢
Δ, ℓ:L ⊳ (ν_ℓ a:A)N yields Γ ⊢ Δ, ℓ:(L, a:A) ⊳ N, to which induction can be applied
to give the required Γ ⊢ N′.
(r-str). This case follows using induction and Proposition 8. □
Acknowledgements
We thank the referees for several comments that sharpened the presentation. Matthew
Hennessy was funded by CONFER II and EPSRC project GR/K60701. James Riely was
funded by NSF grant EIA-9805604.
References
[1] M. Abadi and A.D. Gordon. A calculus for cryptographic protocols: The spi calculus.
Information and Computation, To appear. Available as Compaq SRC Research Report 149
(1998).
[2] B. Bershad, S. Savage, P. Pardyak, E. Sirer, D. Becker, M. Fiuczynski, C. Chambers, and
S. Eggers. Extensibility, safety and performance in the SPIN operating system. In Sympo-
sium on Operating Systems Principles, pages 267-284, 1995.
[3] L. Cardelli and A.D. Gordon. Mobile ambients. In Foundations of Software Science and
Computational Structures, volume 1378 of Lecture Notes in Computer Science, pages 140-
155. Berlin: Springer-Verlag, 1998.
[4] R. DeNicola, G. Ferrari, and R. Pugliese. Types as specifications of access policies. In
J. Vitek and C. Jensen, editors, Secure Internet Programming: Security Issues for Dis-
tributed and Mobile Objects, Lecture Notes in Computer Science. Springer-Verlag, 1999.
[5] C. Fournet, G. Gonthier, J.-J. Lévy, L. Maranget, and D. Rémy. A calculus of mobile
agents. In U. Montanari and V. Sassone, editors, CONCUR: Proceedings of the Interna-
tional Conference on Concurrency Theory, volume 1119 of Lecture Notes in Computer
Science, pages 406-421, Pisa, August 1996. Berlin: Springer-Verlag.
[6] M. Hennessy and J. Riely. Resource access control in systems of mobile agents. In
U. Nestmann and B. Pierce, editors, 3rd International Workshop on High-Level Con-
current Languages (HLCL'98), volume 16(3) of Electronic Notes in Theoretical Com-
puter Science, Nice, September 1998. Elsevier. Available from http://www.elsevier.
nl/locate/entcs. Full version available as Sussex CSTR 98/02, 1998. Available from
http://www.cogs.susx.ac.uk/.
[7] X. Leroy and F. Rouaix. Security properties of typed applets. In Conference Record of
the ACM Symposium on Principles of Programming Languages, San Diego, January 1998.
ACM Press.
[8] R. Milner. The polyadic π-calculus: a tutorial. Technical Report ECS-LFCS-91-180, Labo-
ratory for Foundations of Computer Science, Department of Computer Science, University
of Edinburgh, UK, October 1991. Also in Logic and Algebra of Specification, ed. F. L.
Bauer, W. Brauer and H. Schwichtenberg, Springer-Verlag, 1993.
[9] G. Necula. Proof-carrying code. In Conference Record of the ACM Symposium on Princi-
ples of Programming Languages, Paris, January 1997. ACM Press.
[10] B. Pierce and D. Sangiorgi. Typing and subtyping for mobile processes. Mathematical
Structures in Computer Science, 6(5):409-454, 1996. Extended abstract in LICS '93.
[11] J. Riely and M. Hennessy. A typed language for distributed mobile processes. In Confer-
ence Record of the 25th ACM SIGPLAN-SIGACT Symposium on Principles of Program-
ming Languages, San Diego, January 1998. ACM Press.
[12] J. Riely and M. Hennessy. Trust and partial typing in open systems of mobile agents.
In Conference Record of the 26th ACM SIGPLAN-SIGACT Symposium on Principles of
Programming Languages, San Antonio, January 1999. ACM Press. Full version available
as Sussex CSTR 98/04, 1998. Available from http://www.cogs.susx.ac.uk/.
[13] P. Sewell. Global/local subtyping and capability inference for a distributed π-calculus. In
Proceedings of the International Colloquium on Automata, Languages and Programming,
volume 1433 of Lecture Notes in Computer Science. Berlin: Springer-Verlag, July 1998.
[14] R. Stata and M. Abadi. A type system for Java bytecode subroutines. In Conference Record
of the ACM Symposium on Principles of Programming Languages, San Diego, January
1998. ACM Press.
Types as Specifications of Access Policies
1 Introduction
Most of the difficult issues faced when developing mobile applications running
over a network are related to security. A typical example of a security property
is the requirement that only legitimate mobile agents can be granted access to
specific resources or services. A common solution to this problem is
to provide secure communication channels by means of cryptographic protocols.
These protocols (e.g. SSL, S-HTTP) are designed to provide authentication
facilities for both servers and agents. Another common approach relies on security
architectures which monitor the execution of mobile agents to protect a host from
external attacks on private information.
Recently, several researchers have explored the possibility of considering se-
curity issues at the level of language design, aiming at embedding protection
mechanisms in the languages themselves. For instance, the language Java [4]
exploits type information as a foundation of its security: well-typed Java programs
(and the corresponding verified bytecode) will never compromise the integrity of
certain data.
This idea is not new: type systems have long been used to ensure type safety
of programs. Type safety means that all data will be used consistently with their
declarations. However, there has been little work on exploring and designing
type systems for security.
In this paper we discuss the design of the type system for KLAIM (a Kernel
Language for Agents Interaction and Mobility) [14], an experimental program-
ming language specifically designed for programming mobile agents. KLAIM pro-
vides direct support for expressing and enforcing security policies that control
access to resources and data. In particular, the language uses types to protect
resources and data and to establish policies for access control.
The main guidelines for the design of KLAIM and of its type system are:
- KLAIM processes and types are network aware;
Nets are sets of nodes; each node consists of a site s, an allocation environment
ρ, a set of running processes P and a tuple space T. We formally model T as
a special kind of process. The allocation environment ρ constrains network
connectivity, in that a node will be able to communicate only with the subset of
the nodes of the network determined by ρ. Hereafter, we will use

    s₁ ::ρ₁ P₁ | T₁ ∥ … ∥ sₙ ::ρₙ Pₙ | Tₙ
to denote a net with n nodes.
Processes are the active computational units and may be executed concur-
rently either at the same site or at different sites. Processes can perform five
different basic operations, called actions, that permit reading (writing) from
(in) a tuple space, activating new threads of execution and creating new nodes.
The operation for retrieving information from a node has two variants:
in(t)@ℓ and read(t)@ℓ. Action in(t)@ℓ evaluates the tuple t and looks for a
matching tuple t′ in the tuple space located at ℓ (ℓ gives the logical address of
the tuple space). Whenever the matching tuple t′ is found, it is removed from
the tuple space. The corresponding values of t′ are assigned to the variables in
the formal fields of t and the operation terminates; the new bindings are used
by the continuation of the process that has executed in(t)@ℓ. If no matching
tuple is found, the operation is suspended until one becomes available. Action
read(t)@ℓ differs from in(t)@ℓ because the tuple t′ selected by pattern-matching
is not removed from the tuple space.
The operation for placing information on a node has, again, two variants:
out(t)@ℓ and eval(P)@ℓ. The operation out(t)@ℓ adds the tuple resulting from
the evaluation of t to the tuple space located at ℓ. The operation eval(P)@ℓ
spawns a process (whose code is given by P) at the node located at ℓ.
The operation for creating new nodes is newloc(u₁, …, uₙ). It dynamically
creates a set of n different new sites that can only be accessed via the locality
variables u₁, …, uₙ.
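As an executable illustration of the operations just described, here is a deliberately simplified Python model (entirely our own: tuple spaces are lists, `"?"` marks a formal field, types are ignored, and the blocking behaviour of `in` is rendered as returning `None` when nothing matches):

```python
# Simplified model of a KLAIM node's tuple space with out, read and in.

class Node:
    def __init__(self):
        self.tuples = []          # the local tuple space

    def out(self, t):             # out(t)@l: add an evaluated tuple
        self.tuples.append(t)

    def _match(self, pattern, t):
        # "?" marks a formal field that matches (and would bind) any value
        return len(pattern) == len(t) and all(
            p == "?" or p == v for p, v in zip(pattern, t))

    def read(self, pattern):      # read(t)@l: non-destructive lookup
        for t in self.tuples:
            if self._match(pattern, t):
                return t
        return None

    def in_(self, pattern):       # in(t)@l: destructive lookup
        t = self.read(pattern)
        if t is not None:
            self.tuples.remove(t)
        return t

n = Node()
n.out(("foo", 1))
print(n.read(("foo", "?")))   # -> ('foo', 1), tuple stays in the space
print(n.in_(("foo", "?")))    # -> ('foo', 1), tuple is removed
print(n.read(("foo", "?")))   # -> None
```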
We now provide some simple examples of KLAIM programs; more advanced
programming examples will be presented later. Hereafter, we assume that the
basic values are integers and strings.
Our first example illustrates a process that moves along the nodes of a net
with a fixed binding of localities to sites (this somehow corresponds to static
scoping). We consider a net consisting of two sites s₁ and s₂. A client process C
is allocated at site s₁ and a server process S is allocated at site s₂. The server
S can accept clients for execution. The client process sends process Q to the
server. This is modelled by the following KLAIM code:

    C ≝ out(Q)@l₁.nil
    Q ≝ in("foo", !x)@self.out("foo", x + 1)@self.nil
    S ≝ in(!X)@self.X
The behaviour of the processes above depends on the meaning of l₁ and self.
It is the allocation environment that establishes the links between localities and
sites. Here, we assume that the allocation environment of site s₁, ρ₁, maps self
into s₁ and l₁ into s₂, while the allocation environment of site s₂, ρ₂, maps
self into s₂. Finally, we assume that the tuple spaces located at s₁ and s₂ both
contain the tuple ("foo", 1). The following KLAIM program represents the net
discussed above:

    s₁ ::ρ₁ C | out("foo", 1)  ∥  s₂ ::ρ₂ S | out("foo", 1).
The client process C sends process Q for execution at the server node (locality
l₁ is bound to s₂ in ρ₁). After the execution of out(Q)@l₁, the tuple space at
site s₂ contains a tuple where the code of process Q is stored. Indeed, it is

    Q′ ≝ in("foo", !x)@s₁.out("foo", x + 1)@s₁.nil

the process stored in the tuple, as the localities occurring in Q are evaluated using
the environment at site s₁ where the action out has been executed. Hence, when
executed at the server's site, the mobile process Q increases tuple "foo" at the
client's site. Fig. 1 gives a pictorial representation of this example.
[Fig. 1: step-by-step evolution of the net. The client node s₁ starts with C and
("foo", 1), the server node s₂ with S and ("foo", 1); after Q′ has run at the server,
the client's tuple becomes ("foo", 2) while the server's remains ("foo", 1).]
Our second example illustrates how mobile agents migrate with a dynamic
scoping strategy. In this case the client process C′ is eval(Q)@l₁.nil. When
eval(Q)@l₁ is executed, the process Q is spawned at the remote node with-
out evaluating its localities according to the allocation environment ρ₁. Thus,
the execution of Q will depend only on the allocation environment ρ₂ and Q will
increase tuple "foo" at the server's site. Fig. 2 illustrates this example.
[Fig. 2: with eval, the client node s₁ goes from C′ and ("foo", 1) to nil, while Q runs
at the server node s₂ under ρ₂, whose tuple becomes ("foo", 2).]
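The contrast between the two examples can be condensed into a few lines of Python (our own sketch: a process is reduced to the list of localities its actions mention, and allocation environments to dictionaries):

```python
# out(Q)@l resolves Q's localities with the SENDER's environment before
# shipping (static scoping); eval(Q)@l ships Q as-is, so localities are
# resolved with the RECEIVER's environment (dynamic scoping).

rho1 = {"self": "s1", "l1": "s2"}   # client's allocation environment
rho2 = {"self": "s2"}               # server's allocation environment

Q = ["self", "self"]                # the localities Q's actions mention

def resolve(code, rho):
    return [rho.get(l, l) for l in code]

shipped_by_out  = resolve(Q, rho1)  # closure built at the client
shipped_by_eval = resolve(Q, rho2)  # code resolved at the server
print(shipped_by_out)    # -> ['s1', 's1']: Q' acts on the client's space
print(shipped_by_eval)   # -> ['s2', 's2']: Q acts on the server's space
```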
first accesses the tuple space located at l_s to read an address and to assign it
to the locality variable u, then sends process P for execution at u. The type
specification ([self ↦ {e}], δ_P) expresses that Client is looking for a locality
where it is possible to send for execution a process with type δ_P from the site
where Client is running.
Let us now consider a net where Server is allocated at site s and the two
identical processes Client are at sites s₁ and s₂, where l_s is bound to s to allow
clients to interact with the server.
Process Client has type
This section reviews the syntax and the operational semantics of KLAIM. It may
be skipped by readers interested mainly in the programming issues.
3.1 Processes
The syntax of KLAIM terms is given in Table 1; there, P ranges over process
terms, a over actions, t over tuples, f over tuple fields, e over value expressions,
and ℓ over localities and locality variables. Process and locality variables are
typed whenever they are bound; value variables are kept untyped. We use ũ to
denote a sequence of objects and {ũ} to denote the set of objects in ũ. Tuples
are sequences of fields; hence, for a tuple t, {t} will denote the set of fields of t.
Pairs of the form (λ, δ) consisting of access lists λ and types δ will be called type
specifications. Their precise syntax will be introduced in the next section.
    X                     (process variable)
    A⟨P̃, ℓ̃, ẽ⟩           (process invocation)
    t ::= f | f, t
    f ::= e | P | ℓ | !x | !X : δ | !u : (λ, δ)
Actions have been described in Section 2.1. Here we only want to add a few
comments on the action newloc(ũ : (λ̃, δ̃)), which dynamically creates a set of
"fresh" nodes together with their access paths ũ. It is the only action not indexed
with a locality; it is always executed at the current node. For each uᵢ ∈ ũ, λᵢ
specifies the access rights of the nodes of the net with respect to the new node
uᵢ, and symmetrically for δᵢ. The simultaneous creation of a set of new nodes
allows writing mutually recursive type specifications for these nodes.
Variables occurring in KLAIM terms can be bound by prefixes and process
definitions. More precisely, prefixes in(t)@ℓ._ and read(t)@ℓ._ act as binders for
the variables in the formal fields of t. Prefix newloc(ũ : (λ̃, δ̃))._ binds the locality
variables ũ. Definition A(X̃ : δ̃, ũ : (λ̃, δ̃), x̃) ≝ P is a binder for the variables
{X̃, ũ, x̃}. Hereafter, we shall assume that all bound names in processes are
distinct and shall require that the arguments of eval operations do not contain
free process variables.
We will use the standard notation P[e/x] to indicate the substitution of the
value expression e for the free occurrences of the variable x in P; P[ẽ/x̃] will
denote the simultaneous substitution of any free occurrence of x ∈ {x̃} with the
corresponding e ∈ {ẽ} in P. When substitutions involve locality variables, e.g.
as in P[ℓ̃/ũ], they have to be applied also to the type specifications therein.
Notation P[P̃/X̃, ℓ̃/ũ, ẽ/x̃] has the expected meaning.
3.2 Types
Capabilities are elements of {r, i, o, e, n}, where each symbol stands for the op-
eration whose name begins with it; r denotes the capability of executing a read
operation, i the capability of executing an in operation, and so on. We use Π,
ranged over by π, to denote the set of non-empty subsets of {r, i, o, e, n}.
Access lists, ranged over by λ, are lists [ℓᵢ ↦ πᵢ]_{i=1,…,n}, where the ℓᵢ are all
distinct. Semantically, access lists λ are partial functions from localities and
locality variables to sets of capabilities (i.e. λ(ℓᵢ) = πᵢ). The access list of ℓ,
[ℓᵢ ↦ πᵢ]_{i=1,…,n}, specifies the capabilities of ℓᵢ (i = 1, …, n) relative to ℓ.
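Viewed as data, an access list λ is just a finite partial map from localities to capability sets π ⊆ {r, i, o, e, n}. A minimal sketch (the function and variable names are ours):

```python
# Access lists as partial functions from localities to capability sets.
CAPS = {"r", "i", "o", "e", "n"}

lam = {"l1": {"o"}, "self": CAPS}     # l1 may only output at this node

def allowed(lam, locality, action):
    assert action in CAPS
    # partiality: a locality outside the domain holds no rights at all
    return action in lam.get(locality, set())

print(allowed(lam, "l1", "o"))   # -> True
print(allowed(lam, "l1", "i"))   # -> False
print(allowed(lam, "l2", "r"))   # -> False: l2 is not in the domain
```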
The syntax of KLAIM types is given in Table 2; there ν ranges over type vari-
ables and μ denotes the recursion operator. Hereafter, the following notational
convention will be used: "↦" binds stronger than "μ", which binds stronger than
",". The type ⊥ denotes "void", i.e. no intention is declared by the
process, and, semantically, corresponds to the smallest type. Conversely, the type
⊤ denotes the intention of performing any kind of operation and is the greatest
type. A type of the form ℓ ↦ π ↦ δ describes the intention of performing at ℓ
those actions allowed by π; moreover, it imposes constraint δ on the processes
that could possibly be executed at ℓ (if e ∉ π then δ is ⊥). The type δ₁, δ₂ is the
union of types δ₁ and δ₂; semantically, it is their least upper bound. Recursive
types are used for typing migrating recursive processes.
A type δ generated from the grammar in Table 2 is such that any recursive
type μν.δ′ occurring in δ does not contain ν on the left of ↦. A consequence
3.3 Nets
KLAIM nets are collections of nodes where processes and tuple spaces can be
allocated. A node is a 4-tuple (s, Pₛ, δₛ, ρₛ) where s is a site, Pₛ is the process
located at s, δₛ is the type of s specifying the access control policy of s, and ρₛ is
the allocation environment of s, i.e. a (partial) function from localities to sites.
We write s ::_ρₛ^δₛ Pₛ to denote the node (s, Pₛ, δₛ, ρₛ).
Hereafter, E will denote the set of environments, φ the empty environment,
and {s/ℓ} the environment that maps the locality ℓ to the site s. We will use
ℓ{ρ} to denote ρ(ℓ), if ρ(ℓ) is defined, and ℓ otherwise; moreover, ρ[s/ℓ] will
denote the environment ρ′ such that ρ′(ℓ) = s and ρ′(ℓ′) = ρ(ℓ′) for ℓ′ ≠ ℓ.
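The two operations ℓ{ρ} and ρ[s/ℓ] have a direct reading as dictionary operations; a small sketch (function names are ours):

```python
def resolve(l, rho):        # l{rho}: rho(l) if defined, l itself otherwise
    return rho.get(l, l)

def update(rho, s, l):      # rho[s/l]: maps l to s, unchanged elsewhere
    rho2 = dict(rho)
    rho2[l] = s
    return rho2

rho = {"self": "s1"}
print(resolve("self", rho))           # -> 's1'
print(resolve("l1", rho))             # -> 'l1' (undefined, left unchanged)
print(update(rho, "s2", "l1")["l1"])  # -> 's2'
print(rho)                            # -> {'self': 's1'}: update is functional
```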
To specify the mutual access policies of a set of nodes, hence to consistently
assign types to sites/nodes, we make use of a partial function Δ that, for each
site s, describes the access rights of s on the other sites. Types of nodes have
the same syntax as types of processes. However, strictly speaking, the former
cannot be generated by the grammar given in Table 2, since we required ℓ to
stand for localities and locality variables. For types of nodes, we let ℓ range over
sites.
Table 3. Net Syntax
evaluate the tuple. If the tuple contains a field with a process, the corresponding
field of the evaluated tuple contains the process that results from evaluating its
localities with the local allocation environment. The fact that
processes in tuples are transmitted after the interpretation of their localities
corresponds to having a static scoping discipline for the generation of tuples.
A dynamic scoping strategy is adopted for the eval operation. In this case the
localities of the spawned process are not interpreted using the local allocation
environment. Processes, not closures, are transmitted, and their execution can be
influenced by the remote allocation environment.
A process can perform an in action by synchronizing with (a process rep-
resenting) a matching tuple et. To match the two candidate tuples one has to
consider the site where the operation is executed and the type interpretation
function of the net. The result of the execution of an in action is that tuple et
is withdrawn and its values are used to determine the values of the variables in
the input tuple to be used by the continuation. Action read behaves similarly
to in but leaves the matched tuple et in the tuple space.
The use of newloc leads to the creation of new nodes and modifies the
topology and the types of the net. The allocation environment of the new nodes
is derived from that of the creating node with the obvious update for the self
locality. The type specifications are exploited to generate the new types. The
types of the nodes of the net are "extended" by adding the rights of the existing
nodes over the new ones and the rights of the new nodes over the existing ones.
Access lists are used to enrich the rights of the existing nodes, while types are
used to determine the rights of the new ones.
Pattern-matching is extensively used in the reduction semantics. To select
(from a tuple space) a tuple containing actual fields with processes inside, the
pattern-matching operation checks the types of the processes to ensure that they
satisfy the type constraints specified by the programmer. Hence, process code
is checked before being downloaded. In other words, the pattern-matching
operation performs run-time type checking of incoming code.
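The run-time check performed on code-carrying fields can be sketched as follows (our own simplification: a process's minimal type is modelled as a set of declared capabilities, and subtyping as set inclusion; all names are ours):

```python
# Pattern-matching with run-time type checking of process fields: a formal
# field ("proc", delta) only matches a process whose (pre-computed) type is
# a subtype of delta.

def subtype(d1, d2):          # stand-in for delta1 <= delta2: set inclusion
    return d1 <= d2

def match_field(formal, actual):
    kind = formal[0]
    if kind == "val":         # actual field: values must agree
        return formal[1] == actual
    if kind == "proc":        # !X:delta -- typecheck the incoming code
        code, code_type = actual
        return subtype(code_type, formal[1])
    return False

# A process of type {'o'} may be downloaded where {'o','r'} is allowed,
# but not where only {'i'} is.
incoming = ("print-agent", frozenset({"o"}))
print(match_field(("proc", frozenset({"o", "r"})), incoming))  # -> True
print(match_field(("proc", frozenset({"i"})), incoming))       # -> False
```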
¹ Here, localities l₁ and l₂ are used to represent the addresses of two nodes different
from the node where the user process is allocated. Processes located at these nodes are
allowed to perform only output actions at the node named by u_R. The use of l₁ and
l₂ could be avoided by introducing in the type specifications a distinguished locality
others to denote any node other than those explicitly mentioned.
    s ↦ {r} ↦ ⊥,  s₀ ↦ {o} ↦ ⊥,  s₁ ↦ {o} ↦ ⊥,  …,  sₙ ↦ {o} ↦ ⊥,

where sᵢ, i = 0, …, n, are the sites with the rights of invoking the server's facilities,
and s is the server's site.
Notice, however, that this does not prevent P from visiting other sites. In
particular, agent P may be programmed in such a way that, after having per-
formed the required elaboration, it transmits code Q at the locality lⱼ (the logical
name of site sⱼ):
Restricting Interactions KLAIM action primitives operate on the whole net: oper-
ations on tuple spaces are not forced to happen locally at the current site. From
the point of view of access control policies, communications among different sites
of the net (i.e. remote communications) can be controlled and regulated.
To force a process running on a certain site s to access only local tuples it
suffices to constrain the type δₛ of the site in such a way that for no s′ ≠ s it holds
that s′ ↦ {r} ↦ ⊥ ≼ δₛ. Hence, a process P allocated on s performing remote
read/in operations violates the access rights. To access tuples at a remote tuple
space, a well-typed process must first move (if it has the required rights) to the
remote site. Also output actions can be forced to be local; it suffices to require
that for no s′ ≠ s it holds that s′ ↦ {o} ↦ ⊥ ≼ δₛ.
Fares and Tickets A primary access control policy consists of controlling the
route of a mobile agent traveling in the net. For instance, if one has to configure
a set of sites with new software, a mobile agent can be programmed to travel
among the sites to install the new release of the software.
If the starting site of the trip is site s₀ and sites s₁, s₂, …, sₙ are visited
before coming back to the starting site, then the following equalities specify the
type δ_T (the access rights) of the trip:

    δ₀ = s₁ ↦ {e} ↦ δ₁, δ
    δ₁ = s₂ ↦ {e} ↦ δ₂, δ
The idea is that at each site type <5 specifies the allowed operations (e.g. installing
the new release of a software package); the remaining type information specifies
the structure of the trip (which is the next site of the trip). At the last site of
the trip the agent has the rights of returning to the original site the results of
the trip (e.g. the notification that the installation was successful).
The type discussed above can be properly interpreted as the fare of the trip:
an agent A can perform the trip provided that its type δ_A matches the fare;
formally, δ_A, when interpreted at site s₀, is a subtype of δ_T. In this case the agent
A has the ticket for the trip. Notice that this ensures that a malicious agent
cannot modify the itinerary of the trip to visit sites different from those
listed in its ticket.
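The ticket check can be sketched operationally (our own simplification: the chained trip type δ_T is flattened into a list of hops, and "matching the fare" into following that chain exactly; all names are ours):

```python
# The fare is a chain of hops; an agent holds a ticket only if its planned
# itinerary follows the chain exactly, so detours are rejected.

def make_fare(route, delta_ops):
    # route = [s1, ..., sn, s0]; "ops" records the operations delta allows
    # at each hop (not checked further in this sketch).
    fare = None
    for site in reversed(route):
        fare = {"site": site, "ops": set(delta_ops), "next": fare}
    return fare

def follows(itinerary, fare):
    node = fare
    for hop in itinerary:
        if node is None or hop != node["site"]:
            return False
        node = node["next"]
    return node is None          # the whole chain must be traversed

fare = make_fare(["s1", "s2", "s0"], {"o"})
print(follows(["s1", "s2", "s0"], fare))   # -> True: agent may travel
print(follows(["s1", "s3", "s0"], fare))   # -> False: itinerary altered
```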
    (1) δ₁, δ₂ = δ₂, δ₁
    (2) (δ₁, δ₂), δ₃ = δ₁, (δ₂, δ₃)
    (3) δ, δ = δ
    (4) ⊥, δ = δ
    (5) ⊤, δ = ⊤
    (6) ℓ ↦ π₁ ↦ δ₁, ℓ ↦ π₂ ↦ δ₂ = ℓ ↦ π₁ ∪ π₂ ↦ (δ₁, δ₂)
    π₁ ⊆ π₂ implies π₁ ⊑_Π π₂        {i} ⊑_Π {r}

    π₁ ⊑_Π π₂ and π₂ ⊑_Π π₃ imply π₁ ⊑_Π π₃

    π₁ ⊑_Π π₁′ and π₂ ⊑_Π π₂′ imply (π₁ ∪ π₂) ⊑_Π (π₁′ ∪ π₂′)

    (ax⊥)  ⊥ ≼ δ        (ax⊤)  δ ≼ ⊤        (eq)  δ = δ′ implies δ ≼ δ′

    π₂ ⊑_Π π₁ and δ₁ ≼ δ₂ imply ℓ ↦ π₁ ↦ δ₁ ≼ ℓ ↦ π₂ ↦ δ₂

    ℓ ↦ π ↦ δ ≼ δ′[μν.δ′/ν] implies ℓ ↦ π ↦ δ ≼ μν.δ′
Type contexts Γ are functions mapping process variables and identifiers into
types, and locality variables into type specifications. Type contexts are written
as sequences of assignments X₁ : δ₁, …, Xₙ : δₙ, u₁ : (λ₁, δ′₁), …, u_m : (λ_m, δ′_m).
The special symbol φ is used to denote the empty context.
Before presenting the judgments of the type inference system we introduce
some useful notations.
– (μν.δ′)@ℓ = μν.(δ′@ℓ).

Hereafter, we write δ@ℓ to denote a canonical form of the type δ@ℓ. We also write
ℓ′{ℓ} to denote ℓ if ℓ′ = self, and ℓ′ otherwise.
Given a type context Γ, we write Γ[δ/X] to denote either the extension of
Γ with the assignment X : δ (when X is unbound in Γ), or the updating of Γ
that binds X to δ (Γ[(λ, δ)/u] has a similar meaning). The auxiliary function
update_ℓ, indexed by the locality ℓ where a process is located when the function is
invoked, behaves like the identity function for all fields but !X : δ and !u : (λ, δ).
In the former case, the behaviour of the function is obvious. In the latter case,
the type specification assigned to u is obtained by replacing self with ℓ in λ
and by partially evaluating δ at u. Formally, on composite tuples it satisfies
update_ℓ(Γ, (f, t)) = update_ℓ(update_ℓ(Γ, f), t).
    Γ ⊢_ℓ nil : ⊥        Γ ⊢_ℓ X : Γ(X)@ℓ        Γ ⊢_ℓ A : Γ(A)@ℓ

    Γ ⊢_ℓ P : δ
    ─────────────────────────────────────────────
    Γ ⊢_ℓ out(t)@ℓ′.P : (δ, ℓ′{ℓ} ↦ {o} ↦ ⊥)

    update_ℓ(Γ, t) ⊢_ℓ P : δ      update_ℓ(Γ, t) ⊢_ℓ δ∖t = δ′
    ─────────────────────────────────────────────
    Γ ⊢_ℓ read(t)@ℓ′.P : (δ′, ℓ′{ℓ} ↦ {r} ↦ ⊥)

    update_ℓ(Γ, t) ⊢_ℓ P : δ      update_ℓ(Γ, t) ⊢_ℓ δ∖t = δ′
    ─────────────────────────────────────────────
    Γ ⊢_ℓ in(t)@ℓ′.P : (δ′, ℓ′{ℓ} ↦ {i} ↦ ⊥)

    Γ ⊢_ℓ P : δ      Γ ⊢_ℓ′ Q : δ′
    ─────────────────────────────────────────────
    Γ ⊢_ℓ eval(Q)@ℓ′.P : (δ, ℓ′{ℓ} ↦ {e} ↦ δ′)

    Γ ⊢_ℓ P : δ₁      Γ ⊢_ℓ Q : δ₂
    ─────────────────────────────────────────────
    Γ ⊢_ℓ P | Q : (δ₁, δ₂)
and with the binding between the process identifier A and a (possibly recur-
sive) candidate type δ. The resulting context is exploited to infer (up to type
equality) the type δ for P. A second type inference is triggered by the rule, start-
ing from a type context that additionally contains the bindings for the locality
variables occurring as parameters in the process definition. This last inference
checks whether the intentions of process P comply with the type specifications
of its locality variables. To make the types of process identifiers independent of
the localities where the identifiers are invoked, it is assumed that the latter are
always invoked at self.
The last typing rule is the rule for process invocation. First, it determines the
type of the process identifier and those of the process arguments. Then, it checks
whether the type inferred for any process argument agrees with the one obtained
by partially evaluating at ℓ the type of the corresponding formal parameter. No
requirement is imposed on the other arguments. The inferred type states that,
once we interpret the type of A starting from ℓ (the actual locality where the
invocation takes place), A⟨P̃, ℓ̃, ẽ⟩ intends to perform at ℓ the same operations as
A at u. Indeed, the locality variables occurring as parameters in the definition of
the process may occur in the type inferred for the process identifier. Soundness
of the application of [ℓ/u] to δ follows from the assumption that all bound names
in the definition of A are distinct.
In [15], decidability and the derivability of a minimal type for any typable
process are proved. The type is called minimal because all the other deducible
types are greater than it. The main impact of the existence of a minimal type is
that the type inference system is decidable. The same results hold for the type
system presented in this paper.
Theorem 1. If Γ ⊢ℓ P : δ′ then there exists a minimal type δ such that Γ ⊢ℓ P :
δ and δ ≼ δ″ for all δ″ such that Γ ⊢ℓ P : δ″.
Corollary 1. For any process P, the existence of a type δ such that φ ⊢ℓ P : δ
is decidable.
- (μν.δ′){Θ_Ns}_s = μν.(δ′{Θ_Ns}_s).
Hence, localities that cannot be mapped to sites of the net are left unchanged.
Definition 7. A net Ns is well-typed if, for any node s ::^δs P, there exists δ′
such that φ ⊢s P : δ′ and, if δ is a minimal type for P, then δ{Θ_Ns}_s ≼ δs.
6 Concluding Remarks
We have developed a type system which formalizes access control restrictions of
programs written in KLAIM. Type information is used to specify access rights and
execution privileges, and to detect violations of these policies. The implementa-
tion of the type inference system for X-KLAIM (the prototype implementation
of KLAIM) is in progress; it will also help us to assess our design choices.
We plan to extend the type system by introducing types for tuples (record
types), notions of multi-level security (by structuring localities into levels of
security) and public or shared keys to model dynamic transmission of access
rights. Ideas could also be borrowed from the spi-calculus [2], a concurrent cal-
culus obtained by adding public-key encryption primitives to the π-calculus
[21], and from the SLam calculus [18], another calculus where information about
direct/indirect producers and consumers is associated with data.
Another direction for future research is considering "open" systems. In fact,
our type system can safely deal with new processes landing on existing nodes,
but it does not consider partially specified nets. An enrichment of types is then
needed to specify the permissions granted to "unspecified" sites. One then has
to decide whether this enrichment should be specified at the level of single
nodes or at the level of nets. Interfacing nets would naturally fit with extensions
of our framework to hierarchical nets that would be beneficial also for more
structured access controls. An alternative approach to deal with open systems
could also be that of relaxing the static type checking phase by not requir-
ing well-typedness of the whole net (this corresponds to the fact that only the
typed sites can be trusted) while increasing the run-time type checking phase,
e.g. agents migrating from untyped (i.e. untrusted) sites must be dynamically
typechecked. This is the approach followed in [24].
Type systems have been used also for other calculi of mobile processes.
Among those reminiscent of ours, although not addressing security issues, we
mention the work of Pierce and Sangiorgi [23]. They develop a type system for
the π-calculus using channel types to specify whether channels are used to read
or to write. This type system has been extended in [20] by associating multi-
plicities to types for stating the number of times each channel can be used. The
type system of [23] has also been generalized by Sewell [25] to capture locality of
channel names and by Boreale and Sangiorgi [7] to obtain trimmer bisimulation proofs.
Only recently attempts have been made to characterize security properties
in terms of formal type systems. A type system for the spi-calculus has been
developed by Abadi [1] to guarantee secrecy of cryptographic protocols. Abadi
and Stata [3] have used type rules to specify and verify the correctness of the
Java Bytecode Verifier. Hennessy and Riely [19] have introduced a type system
for the language Dπ, a distributed variant of the π-calculus, for controlling the use of
resources in nets. This work is similar to ours (types are abstraction of pro-
cess behaviours and access rights violations are type errors), but the technical
developments are quite different; resources are channels and types describe per-
missions to use channels. Moreover, access rights are fixed irrespective of the
localities where the processes themselves are executed. In [24], the type system of
[19] has been improved to consider nets where sites can also harbour malicious
agents that do not respect the rules on the use of resources. Cardelli and Gordon
[10] have introduced a type system for mobile ambients [9] that controls the type
of the values exchanged among administrative domains (ambients) so that the
communication of values cannot cause run-time faults. Volpano and Smith have
developed type systems to ensure secure information flow (noninterference) for
both a sequential procedural language [27] and for a multithreaded imperative
language [28]. Boudol [8] has used types to abstract from terms the possible
sequences of interactions and the resources used by processes. Necula [22] has
introduced an approach to ensure correctness of mobile code with respect to a
fixed safety policy, where code producers provide the code with a proof of cor-
rectness that code consumers check before allowing the code to execute. Vitek
and Castagna [26] have proposed a language-based approach, relying on power-
ful mobility and protection primitives, rather than on type systems, for secure
Internet programming. Bodei, Degano, Nielson and Nielson [6] have proposed
an alternative approach for the analysis of security and of information flow, that
relies on static analysis techniques.
Acknowledgments
We would like to thank Betti Venneri for her contribution to the development of
the KLAIM type system. We are also grateful to Lorenzo Bettini, Michele Boreale
and Michele Loreti for general discussions about KLAIM.
This work has been partially supported by the Esprit Working Groups CON-
FER2 and COORDINA, and by CNR: Progetti "Metodologie e Strumenti di
Analisi, Verifica e Validazione per Sistemi Software Affidabili" e "Modelli e
Metodi per la Matematica e l'Ingegneria".
A Operational Semantics
{ρ} is not defined whenever dom(X̃) ⊄ dom(ρ). In the latter case, we get a
process closure P{ρ} that is evaluated by using the laws in Table 9. There,
δ{ρ} is obtained from δ by using ρ to interpret the shallow localities of δ; it can
be defined inductively on the syntax of δ: all its clauses are structural apart from
(ℓ ↦ π ↦ δ′){ρ} = ℓ{ρ} ↦ π ↦ δ′.
The notion of conservative extension of the function used to induce the types
of the nodes of a net is needed to derive the new types of the nodes of the net in
case of dynamic reconfiguration. Indeed, when a set of new nodes is created, the
type specifications written by the programmer have to be taken into account.
In general, the types of the nodes of the net have to be "extended" by adding
the rights of the existing nodes over the new ones and the rights of the new
nodes over the existing ones. Access lists are used to enrich the rights of the
existing nodes, while types are used to determine the rights of the new nodes.
An extension is called conservative whenever the rights over the new nodes are
consistent with the rights over the creating node and the types specified for the
new nodes are compatible with those induced by the extension; the predicate
compat is used to this purpose.
nil{ρ} = nil
X{ρ} = X
(out(t)@ℓ.P){ρ} = out(t{ρ})@ℓ{ρ}.P{ρ}
(eval(Q)@ℓ.P){ρ} = eval(Q)@ℓ{ρ}.P{ρ}
(in(t)@ℓ.P){ρ} = in(t{ρ})@ℓ{ρ}.P{ρ}
(read(t)@ℓ.P){ρ} = read(t{ρ})@ℓ{ρ}.P{ρ}
(newloc(u : (λ, δ)).P){ρ} = newloc(u : (λ{ρ}, δ)).P{ρ}
(P1 | P2){ρ} = P1{ρ} | P2{ρ}
A⟨P̃, ℓ̃, ẽ⟩{ρ} = P[P̃/X̃, ℓ̃/ũ, ẽ/x̃]{ρ}   if A(X̃ : δ̃, ũ : (λ̃, δ̃), x̃) ≝ P
e{ρ} = e
(ℓ : (λ, δ)){ρ} = ℓ{ρ} : (λ{ρ}, δ)
!x{ρ} = !x
(!u : (λ, δ)){ρ} = !u : (λ{ρ}, δ)
(!X : δ){ρ} = !X : δ{ρ}
(f, t){ρ} = f{ρ}, t{ρ}
- δ̂_si, for si ∈ S ∪ {s̃} ∪ {s}, are the solutions of the system of type equations
  induced by the conservative extension of Δ with respect to {s̃}, u : (λ, δ), Θ
  and s, where Θ̂ is the extension of Θ_Ns with respect to {s̃} and s;
- Ns[δ̂_s/δ_s]_{s∈S} is the net Ns where the types δs of the nodes are replaced
  by δ̂s;
- N_{s̃}, if {s̃} = {s1, …, sn}, is the net s1 ::^δ̂s1 nil ‖ … ‖ sn ::^δ̂sn nil.
s′ = ρ(ℓ)    et = [[ t ]]
─────────────────────────────────────────────────────────────────────────────
Ns ‖ s ::^δ_ρ out(t)@ℓ.P ‖ s′ ::^δ′_ρ′ P′  ⟶  Ns ‖ s ::^δ_ρ P ‖ s′ ::^δ′_ρ′ (P′ | out(et))

s′ = ρ(ℓ)
─────────────────────────────────────────────────────────────────────────────
Ns ‖ s ::^δ_ρ eval(Q)@ℓ.P ‖ s′ ::^δ′_ρ′ P′  ⟶  Ns ‖ s ::^δ_ρ P ‖ s′ ::^δ′_ρ′ (P′ | Q)

s′ = ρ(ℓ)    match([[ t ]], et, s, Θ_Ns)
─────────────────────────────────────────────────────────────────────────────
Ns ‖ s ::^δ_ρ in(t)@ℓ.P ‖ s′ ::^δ′_ρ′ out(et)  ⟶  Ns ‖ s ::^δ_ρ P[et/[[ t ]]] ‖ s′ ::^δ′_ρ′ nil

{s̃} ∩ (S ∪ {s}) = ∅
─────────────────────────────────────────────────────────────────────────────
Ns ‖ s ::^δs_ρ newloc(ũ : (λ̃, δ̃)).P  ⟶  Ns[δ̂_s/δ_s]_{s∈S} ‖ s ::^δ̂s_ρ P[s̃/ũ] ‖ N_{s̃}

Ns ‖ s ::^δ_ρ P[P̃/X̃, ℓ̃/ũ, ẽ/x̃]  ⟶  N
─────────────────────────────────────────   if A(X̃ : δ̃, ũ : (λ̃, δ̃), x̃) ≝ P
Ns ‖ s ::^δ_ρ A⟨P̃, ℓ̃, ẽ⟩  ⟶  N

N ≡ N1    N1 ⟶ N2    N2 ≡ N′
──────────────────────────────
N ⟶ N′
T a b l e 1 1 . Matching Rules
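The out and in reductions formalized above can be mimicked by an executable model. Below is a minimal sketch in Python; the `Net` class, the site names, and the use of `'?'` to stand for a formal tuple field are our own illustrative encoding, not part of KLAIM's definition:

```python
# Hypothetical sketch of the out/in net reductions: each site owns a list of
# deposited tuples; out(t)@l adds the evaluated tuple at the site l resolves
# to, and in(t)@l destructively withdraws a matching tuple.

class Net:
    def __init__(self):
        self.spaces = {}              # site -> list of deposited tuples

    def add_site(self, site):
        self.spaces[site] = []

    def out(self, site, et):
        self.spaces[site].append(et)  # rule for out: deposit et at site

    def _match(self, pattern, t):
        # '?' is a formal field that matches any value
        return len(pattern) == len(t) and all(
            p == '?' or p == v for p, v in zip(pattern, t))

    def in_(self, site, pattern):
        # rule for in: consume the first matching tuple, if any
        for t in self.spaces[site]:
            if self._match(pattern, t):
                self.spaces[site].remove(t)
                return t
        return None                   # no match: the in operation blocks

net = Net()
net.add_site('s1')
net.out('s1', ('foo', 42))
assert net.in_('s1', ('foo', '?')) == ('foo', 42)
assert net.spaces['s1'] == []         # the tuple was consumed
```

A `read` operation would be identical to `in_` minus the `remove` call, matching the non-destructive reading rule.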
References
10. L. Cardelli, A. Gordon. Types for Mobile Ambients. Proc. of the ACM Symposium
on Principles of Programming Languages, ACM Press, 1999.
11. N. Carriero, D. Gelernter. Linda in Context. Communications of the ACM,
32(4):444-458, 1989.
12. G. Cugola, C. Ghezzi, G.P. Picco, G. Vigna. Analyzing Mobile Code Languages. In
Mobile Object Systems: Towards the Programmable Internet (J. Vitek, C. Tschudin,
Eds.), LNCS 1222, Springer, 1997.
13. R. De Nicola, G. Ferrari, R. Pugliese. Coordinating Mobile Agents via Blackboards
and Access Rights. Coordination Languages and Models (COORDINATION'97),
Proceedings (D. Garlan, D. Le Metayer, Eds.), LNCS 1282, pp. 220-237, Springer,
1997.
14. R. De Nicola, G. Ferrari, R. Pugliese. KLAIM: a Kernel Language for Agents Interac-
tion and Mobility. IEEE Transactions on Software Engineering, Vol.24(5):315-330,
IEEE Computer Society Press, 1998.
15. R. De Nicola, G. Ferrari, R. Pugliese, B. Venneri. Types for Access Control. Avail-
able at http://rap.dsi.unifi.it/papers.html. To appear in Theoretical Com-
puter Science.
16. D. Gelernter. Generative Communication in Linda. ACM Transactions on Pro-
gramming Languages and Systems, 7(1):80-112, ACM Press, 1985.
17. D. Gelernter, N. Carriero, S. Chandran, et al. Parallel Programming in Linda. Proc.
of the IEEE International Conference on Parallel Programming, pp. 255-263, IEEE
Computer Society Press, 1985.
18. N. Heintze, J.G. Riecke. The SLam calculus: Programming with secrecy and in-
tegrity. Proc. of the ACM Symposium on Principles of Programming Languages,
ACM Press, 1998.
19. M. Hennessy, J. Riely. Resource Access Control in Systems of Mobile Agents. Proc.
Int. Workshop on High-Level Concurrent Languages, vol. 16(3) of Electronic Notes
in Theoretical Computer Science, Elsevier, 1998.
20. N. Kobayashi, B. Pierce, D. Turner. Linearity and the π-calculus. Proc. of the
ACM Symposium on Principles of Programming Languages, ACM Press, 1996.
21. R. Milner, J. Parrow, D. Walker. A calculus of mobile processes, (Part I and II).
Information and Computation, 100:1-77, 1992.
22. G. Necula. Proof-carrying code. Proc. of the ACM Symposium on Principles of
Programming Languages, ACM Press, 1997.
23. B. Pierce and D. Sangiorgi. Typing and subtyping for mobile processes. Mathe-
matical Structures in Comp. Science, 6(5):409-454, 1996.
24. J. Riely, M. Hennessy. Trust and Partial Typing in Open Systems of Mobile Agents.
Proc. of the ACM Symposium on Principles of Programming Languages, ACM
Press, 1999.
25. P. Sewell. Global/Local Subtyping and Capability Inference for a Distributed
π-calculus. International Colloquium on Automata, Languages and Programming
(ICALP'98), Proceedings (K.G. Larsen, S. Skyum, G. Winskel, Eds.), LNCS 1443,
Springer, 1998.
26. J. Vitek, G. Castagna. A Calculus of Secure Mobile Computations. Proc. of Work-
shop on Internet Programming Languages, Chicago, 1998.
27. D. Volpano, G. Smith. A type-based approach to program security. Theory
and Practice of Software Development (TAPSOFT'97), Proceedings (M. Bidoit,
M. Dauchet, Eds.), LNCS 1214, pp. 607-621, Springer, 1997.
28. D. Volpano, G. Smith. Secure Information Flow in a Multi-threaded Imperative
Language. Proc. of the ACM Symposium on Principles of Programming Languages,
ACM Press, 1998.
Security Properties of Typed Applets
1 Introduction
What, exactly, makes strongly-typed applets more secure than untyped ones?
Most frameworks proposed so far for safe local execution of foreign code rely
on strong typing, either statically checked at the client side [19,44], statically
checked at the server side and cryptographically signed [35], or dynamically
checked by the client [8]. However, the main property guaranteed by strong typ-
ing is type soundness: "well-typed programs do not go wrong", e.g. do not apply
an integer as if it were a function. While violations of type soundness constitute
real security threats (casting a well-chosen string to a function or object type
allows arbitrary code to be executed), there are many more security concerns,
such as integrity of the running site (an applet should not delete or modify ar-
bitrary files) and confidentiality of the user's data (an applet should not divulge
personal information over the network). The corresponding security violations
do not generally invalidate type soundness in the conventional sense.
If we examine the various security problems identified for Java applets [12],
some of them do cause a violation of Java type soundness [21]; others corre-
spond to malicious, but well-typed, uses of improperly protected functions from
the applet's execution environment [7]. Another typical example is the ActiveX
applet described in [10] that mounts a Trojan attack on the Quicken home-banking
software: money gets transferred from the user's bank account to some offshore
account, all in a perfectly type-safe way.
From these examples, it is intuitively obvious that security properties must
be enforced by the applet's execution environment. It is the environment that
eventually decides which computer resources the applet can access. This is the
essence of the so-called "sandbox model". Strong typing comes into the picture
only to guarantee that this environment is used in accordance with its publicized
interface. For instance, typing prevents an applet from jumping in the middle
of the code for an environment function, or scanning the whole memory space
of the browser, which would allow the applet to abuse or bypass entirely the
execution environment.
execution, but whose successive contents must always satisfy some invariant, i.e.
remain within a given set of permitted values.
The first motivation for this policy is to formalize the intuitive idea that an
applet must not trash the memory of the computer executing it. In particular,
the internal state of the browser, the operating system, and other applications
running on the machine must not be adversely affected by the applet.
This security policy can also be stretched to account for input/output behav-
ior, notably accesses to files and simple cases of network connections. A low-level,
hardware-oriented view of I/O is to consider hardware devices such as the disk
controller and network interface as special locations in the store; I/O is then
controlled by restricting what can be written to these locations. For a higher-
level view, each file or network connection can be viewed as a reference, which
can then be controlled independently of others. Here, the file system, the name
service and the routing tables become dictionary-like data structures mapping
file names, host names, and network addresses to the references representing files
and connections.
By concentrating on writes to sensitive locations, we focus on integrity prop-
erties of the system running the applet. It is also possible to control reads from
sensitive locations, thus establishing simple privacy properties. We will not do
it in this paper for the sake of simplicity, but the results of section 3 also extend
to controlled reads. More advanced privacy properties, as provided for instance
by information flow models, are well beyond our approach, however.
To enforce the security policy, we give a semantics to our language that monitors
reference assignments, and reports run-time errors in the case of illegal writes.
We use a standard big-step operational semantics in the style of [32,39,25].
Source terms are mapped to values, which are terms with the following syntax:
Values:        v ::= b                   values of base types
                  |  λx.a[e]             function closures
                  |  (v1, v2)            pairs of values
                  |  ℓ                   store locations
Results:       r ::= v/s                 normal termination
                  |  err                 write violation detected
Environments:  e ::= [x1 ← v1; …; xn ← vn]
Stores:        s ::= [ℓ1 ← v1; …; ℓn ← vn]
Normal rules:
The only unusual ingredient in this semantics is the φ component, which
maps store locations to sets of values: if φ(ℓ) is defined, values written to the
location ℓ must belong to the set φ(ℓ), otherwise a run-time error err is gener-
ated; if φ(ℓ) is undefined, any value can be stored at ℓ. (See rules 10 and 11.)
For instance, taking φ(ℓ) = ∅ prevents any assignment to ℓ.
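The behaviour of the φ component can be sketched concretely. In the illustrative Python model below, the names `Store`, `phi`, and `Err` are ours; `phi` plays the role of the store control, and a forbidden write raises the err result:

```python
# Illustrative model of the monitored store: phi maps a location to the set
# of values that may be written there; locations outside Dom(phi) are
# unconstrained.  A forbidden write yields the err result (cf. rules 10, 11).

class Err(Exception):
    """Signals the err result: a write violation was detected."""

class Store:
    def __init__(self, phi):
        self.phi = phi            # location -> set of permitted values
        self.cells = {}

    def write(self, loc, value):
        allowed = self.phi.get(loc)
        if allowed is not None and value not in allowed:
            raise Err(f"illegal write of {value!r} to {loc!r}")
        self.cells[loc] = value

s = Store(phi={"uid": {"applet"}, "quota": set()})
s.write("scratch", 123)          # unconstrained location: accepted
s.write("uid", "applet")         # permitted value: accepted
try:
    s.write("quota", 1)          # phi(quota) = {} forbids every write
except Err:
    print("err")                 # prints "err"
```

Setting `phi[loc] = set()` reproduces the φ(ℓ) = ∅ case: every assignment to that location is refused.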
The rules for propagating the err result and aborting execution (rules 12-
21) are the same as those for propagating run-time type errors (wrong) in [39];
the only difference is that we have no rules to detect run-time type errors, thus
making no difference between run-time type violations and non-terminating pro-
grams (no derivations exist in both cases): the standard type soundness theorems
show that type violations cannot occur at run-time in well-typed source terms.
An unusual aspect of our formalism is that the store control φ must be
given at the start of the execution. The reason is that, with big-step operational
semantics, it does not suffice to perform a regular evaluation e, s ⊢ a → v/s′
and observe the differences between s and s' to detect illegal writes. For one
thing, we would not observe temporary assignments, where a malicious applet
writes illegal values to a sensitive location, then restores the original values before
terminating. Also, we could not say anything about non-terminating terms: the
applet could perform illegal writes, then enter an infinite loop to avoid detection.
By providing the store control φ in advance, we ensure that the first write error
will be detected immediately and reported as the err result.
Unfortunately, this provides no way to control stores to locations created
during the evaluation (rule 8 chooses these locations outside of Dom(φ), meaning
that writes to these locations will be free): only preexisting locations can be
sensitive. (This can be viewed as an inadequacy of big-step semantics, and a
small-step, reduction-based semantics would fare better here. However, several
semantic features that play an important role in our study are easier to express in
a big-step semantics than in a reduction semantics: the clean separation between
browser-supplied environment and applet-supplied source term, and the ability
to interpret abstract type names by arbitrary sets of values.)
3 Reachability-based security
The first security property for our calculus formalizes the idea that an applet can
only write to locations that are reachable from the initial environment in which
it executes, or that are created during the applet's execution. For instance, if the
references representing files are not reachable from the execution environment
given to applets, then no applet can write to a file.
Reachability, here, is to be understood in the garbage collection sense: a
location is reachable if there exists a path in the memory graph from the initial
environment to the location, following one or several pointers. More formally, we
define the set RL(v, s) of locations reachable from a value v in a store s by the
following equations:
RL(b, s) = ∅
RL(λx.a[e], s) = RL(e, s)
RL((v1, v2), s) = RL(v1, s) ∪ RL(v2, s)
RL(ℓ, s) = {ℓ} ∪ RL(s(ℓ), s)
RL(e, s) = ⋃_{x ∈ Dom(e)} RL(e(x), s)
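These equations transcribe directly into a recursive traversal of the memory graph. Here is a sketch in Python over a tagged value encoding of our own devising (the tags `'base'`, `'closure'`, `'pair'`, `'loc'` are not from the paper):

```python
# Reachable locations RL(v, s), computed as in the equations above over
#   ('base', b) | ('closure', env) | ('pair', v1, v2) | ('loc', l)
# env maps variable names to values; the store s maps locations to values.

def RL(v, s, seen=None):
    seen = set() if seen is None else seen
    tag = v[0]
    if tag == 'base':                 # RL(b, s) = {}
        return set()
    if tag == 'closure':              # RL(lambda x.a[e], s) = RL(e, s)
        locs = set()
        for w in v[1].values():
            locs |= RL(w, s, seen)
        return locs
    if tag == 'pair':                 # union over both components
        return RL(v[1], s, seen) | RL(v[2], s, seen)
    if tag == 'loc':                  # {l} plus what l's contents reach
        l = v[1]
        if l in seen:                 # guard against cycles in the store
            return {l}
        seen.add(l)
        return {l} | RL(s[l], s, seen)
    raise ValueError(tag)

store = {'l1': ('pair', ('base', 0), ('loc', 'l2')), 'l2': ('base', 1)}
env = {'r': ('loc', 'l1')}
assert RL(('closure', env), store) == {'l1', 'l2'}
```

The `seen` set is exactly the marking step of a garbage collector: it makes the traversal terminate on cyclic stores.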
environment in which it proceeds; the applet does not have access to the full
execution environment of the browser — as would be the case in a dynamically-
scoped language, such as Emacs Lisp, or a language with special constructs to
access the environment of the caller, such as Tcl.
In defining reachable locations, we have treated closures like tuples (as garbage
collectors do): the locations reachable from λx.a[e] are those reachable from e(y)
for some y ∈ Dom(e). There is, however, a big difference between closures and
tuples. Tuples are passive data structures: any piece of code that has access to
the tuple can then obtain pointers to the components of the tuple. Closures are
active data structures: only the code part of the closure can access directly the
data part of the closure (the values of the free variables); other code fragments
can only apply the closure, but not access the data part directly. In other words,
the code part of a closure mediates access to the data part. This property is
often referred to as procedural abstraction [34].
For instance, consider the following function, similar to many Unix sys-
tem calls, where uid is a reference holding the identity of the caller (applet
or browser):
Assume this function is part of the applet environment, but not the reference
uid itself. Then, there is no way that the applet can modify the location of uid,
even though that location is reachable from the environment.
A less obvious example, where the reference uid is not trivially read-only, is
the following function in the style of the Unix setuid system call:
λnewid. if !uid = browser
        then uid := newid
        else raise an error
Assuming uid is not initially browser, an applet cannot change uid by calling
this function.
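This kind of procedural abstraction can be reproduced in any language with closures. Below is a Python rendering of the setuid-style example (the function names and the list-as-reference-cell encoding are our own):

```python
# The reference uid lives only in the closure's environment: callers can
# invoke set_uid but cannot reach the cell directly, so the identity check
# mediates every write (procedural abstraction).

def make_setuid():
    uid = ['browser']                 # the encapsulated reference cell

    def set_uid(newid):
        if uid[0] == 'browser':       # only the browser may change identity
            uid[0] = newid
        else:
            raise PermissionError('not the browser')

    def get_uid():
        return uid[0]

    return set_uid, get_uid

set_uid, get_uid = make_setuid()
set_uid('applet')                     # allowed: uid was 'browser'
print(get_uid())                      # prints: applet
try:
    set_uid('browser')                # refused: caller is no longer browser
except PermissionError:
    print('denied')
```

As in the calculus, the cell is reachable from the closure but not modifiable at will: every write goes through the guard in `set_uid`.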
Procedural abstraction can be viewed as the foundation for access control
lists and similar programming techniques, which systematically encapsulate re-
sources inside functions that check the identity and credentials of the caller
before granting access to the requested resources. For instance, a file opening
function contains the whole data structure representing the file system in its
closure, but grants access only to files with suitable permissions. Thus, while all
files are reachable from the closure of the open function, only those that have
suitable permissions can be modified by the caller.
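The file-opening function can be sketched in the same style; the permission map and all names below are illustrative assumptions, not taken from the paper:

```python
# Access control by encapsulation: the closure holds the entire file table,
# yet callers obtain only files whose permissions allow it.  All files are
# reachable from the closure; only some are accessible through it.

def make_open(files, perms):
    def open_file(name, caller):
        if caller not in perms.get(name, set()):
            raise PermissionError(name)
        return files[name]            # grant access to the file's contents
    return open_file

files = {'/etc/passwd': 'root:x:0', '/tmp/scratch': 'scratch-data'}
perms = {'/tmp/scratch': {'applet'}, '/etc/passwd': {'browser'}}
open_file = make_open(files, perms)

print(open_file('/tmp/scratch', 'applet'))   # prints: scratch-data
try:
    open_file('/etc/passwd', 'applet')       # refused despite reachability
except PermissionError:
    print('denied')
```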
To formalize these ideas, we set out to define the set of locations ML(v, s)
that are actually modifiable (not merely reachable) from a value v and a store s,
and show that if a location ℓ is not in ML(e, s), then any applet evaluated in
the environment e does not write to ℓ. This result is stronger than Property 1
because the location ℓ that is not modifiable from e in s can still be reachable
(via closures) from e in s.
For passive data structures (locations, tuples), modifiability coincides with
reachability: a location is modifiable from v/s if a sequence of fst, snd, and ! op-
erations applied to v in s evaluates to that location. The difficult case is defining
modifiable locations for a function closure. The idea is to consider all possible
applications of the closure to an argument: a location ℓ is considered modifiable
from the closure only if one of those applications writes to the location, or causes
the location to become modifiable otherwise.
More precisely, let e_apl be the execution environment given to applets, and
let c = λx.a[e] be one of the closures contained in e_apl. A location ℓ is modifiable
from c in store s if there exists a value v such that the following conditions hold:
Condition 1. The application of the closure to v causes ℓ to be modified, i.e.
{ℓ ↦ ∅}, e{x ← v}, s ⊢ a → err.
Example: Let e be the environment [r ← ℓ]. Then, ℓ is modifiable from (λx. r :=
x + 1)[e] in any store, since any application of the closure causes ℓ to be assigned.
In a store s such that s(ℓ_v) = 0, the location ℓ is not modifiable from e_apl(g).
However, ℓ is modifiable from e_apl(f) in s, since one application of that closure
returns a store s′ such that s′(ℓ_v) = 1, and in that store s′, ℓ is modifiable from
e_apl(g): any application of e_apl(g) with initial store s′ writes to ℓ.
Condition 5. In conditions 1-4 above, it must be the case that the location ℓ
found to be modifiable from the closure c is not actually modifiable from the
argument v passed to the closure. Otherwise, we would not know whether the
location really "comes from" the closure c or is merely modified by the applet-
provided argument v.
Example: Consider the higher-order function c = (λf. f(0))[∅]. If we apply c to
(λn. r := n)[r ← ℓ], we observe a write to location ℓ. However, ℓ should not be
considered as modifiable from c, since it is also modifiable from the argument
given to c.
As should now be apparent from conditions 1-5 above, the notion of mod-
ifiability raises serious problems, both practical and technical. On the practical
side, the set of modifiable locations ML(v, s) is not computable from v and s: in
the closure case, we must consider infinitely many possible arguments. Thus, a
full mathematical proof is needed to determine ML(v, s).
Moreover, modifiable locations cannot be determined locally. As condition 4
shows, the modifiable locations of a closure depend on the modifiable locations
of all functions from the applet environment e_apl. Thus, if we manage to deter-
mine ML(e_apl, s), then add one single function to the applet environment, we
must not only determine the modifiable locations from the new function, but
also reconsider all other functions in the environment to see whether their mod-
ifiable locations have changed. This is clearly impractical. Hence, the notion of
modifiability is not effective and is interesting only from a semantic viewpoint
and as a guide to derive decidable security criteria in the sequel.
On the technical side, conditions 1-5 above do not lead to a well-founded
definition of the sets of modifiable locations ML(v, s). The problem is condition 5
(the requirement that the location must not be modifiable from the argument
given to the closure): viewing conditions 1-4 as a fixpoint equation for some
operator, that operator is not increasing because of the negation in condition 5.
In appendix B, we tackle this problem and show that non-modifiable locations
are indeed never modified in the particular case where the applet's environment
e_apl is well-typed and its type E_apl does not contain any ref types, so that no
references are exchanged directly between the applet and its environment. In
the remainder of this paper, we abandon the notion of modifiability in its full
generality, and develop more effective techniques to restrict writes to reachable
locations, relying on type-based instrumentation of the browser code.
       φ, e, s ⊢ a → r
(22)  ───────────────────
       φ, e, s ⊢ t(a) → r

       φ, e, s ⊢ a → v/s′     t ∈ Dom(PV) implies v ∈ PV(t)
(23)  ──────────────────────────────────────────────────────
       φ, e, s ⊢ OK_t(a) → v/s′

       φ, e, s ⊢ a → err
(24)  ───────────────────────
       φ, e, s ⊢ OK_t(a) → err
this respect, our named types behave very much like the is new type definition
in Ada, and unlike type abbreviations in ML. Making the coercions explicit fa-
cilitates the definition of the program transformations in section 5, ensuring in
particular that each term has a unique type.
The mapping TD of type definitions is essentially global: type definitions
local to an expression are not supported. Still, it is possible to type-check some
terms against a set of type definitions TD' that is a strict subset of TD, thus
rendering the named types not defined in TD' abstract in that term. We will use
this facility in sections 5.2 and 5.3 to make named types abstract in the applet.
The other unusual feature of our type system is the family of operators OK_t (one
for each named type t) used to perform run-time validation of their argument.
For each named type t, we assume given a set PV(t) of permitted values for
type t. (We actually allow PV(t) to be undefined for some types t, which we
take to mean that all values of type t are valid.) The expression OK_t(e) checks
whether the value of e is in PV(t); if yes, it returns the value unchanged; if
no, it aborts the execution of the applet and reports an error. In the evaluation
rules, the "yes" case corresponds to rule 23; there is no rule for the "no" case,
meaning that no evaluation derivation exists if an OK_t test fails. In effect, we
do not distinguish between failure of OK_t and non-termination. At any rate, we
must not return the err result when OK_t fails: no write violation has occurred
yet.
By varying PV(t), we can control precisely the values of type t that will pass
run-time validation. For instance, PV(filename) could consist of all strings
referencing files under the applet's temporary directory /tmp/applet.x, a new
directory that is created empty at the beginning of the applet's execution. Com-
bined with the techniques described in section 5, this would ensure that only
files in this temporary directory can be accessed by the applet. Similarly, the
set PV(widget) could consist of all GUI widget descriptors referring to widgets
that are children of the applet's top widget, thus preventing the applet from
interacting with widgets belonging to the browser. Other examples of run-time
validation include checking cryptographic signatures on the applet code itself or
on sensitive data presented by the applet.
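A validator of this kind can be sketched as follows. The directory name, the function name `ok_filename`, and the abort-by-exception encoding are our own assumptions, not the paper's:

```python
# Sketch of OK_t for t = filename: PV(filename) is the set of paths that
# stay inside the applet's private directory.  ok_filename returns its
# argument unchanged if validation succeeds, and aborts the applet
# otherwise (no err result is produced, matching rule 23).

import os.path

APPLET_DIR = '/tmp/applet.42'         # hypothetical per-applet directory

class ValidationError(Exception):
    """Validation failed: the applet is aborted."""

def ok_filename(path):
    # normalise to defeat '..' tricks, then check the directory prefix
    real = os.path.normpath(path)
    if os.path.commonpath([APPLET_DIR, real]) != APPLET_DIR:
        raise ValidationError(path)
    return real                       # value passes through unchanged

print(ok_filename('/tmp/applet.42/data'))        # accepted
try:
    ok_filename('/tmp/applet.42/../../etc/pwd')  # escapes the sandbox
except ValidationError:
    print('rejected')
```

Normalising before the prefix check matters: a raw string comparison would wrongly accept paths that traverse out of the directory with `..` components.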
In practice, validation OK_t(e) involves not only the value of its argument e,
but also external information such as the identity of the principal, extra capa-
bility arguments passed to the validation functions, and possibly user replies
to dialog boxes. A typical example is the Java SecurityManager class, which
determines the identity of the principal by inspection of the call stack [43]. For
simplicity, we still write OK_t as a function of the value of its argument.
The evaluation rule for OK_t assumes of course that membership in PV(t)
is decidable. This raises obvious difficulties if t stands for a function type, at
least if the domain type is infinite. Difficulties for defining PV(t) also arise if t
is a reference type: checking the current contents of the references offers no
guarantees with respect to future modifications; checking the locations of the
references against a fixed set of locations is very restrictive. For those reasons,
we restrict ourselves to types t that are defined as algebraic datatypes: type
expressions obtained by combining base types with datatype constructors such
as list or tuples, but not with ref nor the function arrow.
- S ⊨ b : ι if Typeof(b) = ι
- S ⊨ v : t if S ⊨ v : TD(t)
- S ⊨ λx.a[e] : τ1 → τ2 if there exists a typing environment E such that
  S ⊨ e : E and E ⊢ λx.a : τ1 → τ2
- S ⊨ (v1, v2) : τ1 × τ2 if S ⊨ v1 : τ1 and S ⊨ v2 : τ2
- S ⊨ ℓ : τ ref if τ = S(ℓ)
- S ⊨ e : E if Dom(e) = Dom(E) and for all x ∈ Dom(e), S ⊨ e(x) : E(x)
- ⊨ s : S if Dom(s) = Dom(S) and for all ℓ ∈ Dom(s), S ⊨ s(ℓ) : S(ℓ).
Using the semantic typing relations defined above, we then have the familiar
strong soundness property below for the type system. We say that a store typing
S′ extends another store typing S if Dom(S′) ⊇ Dom(S) and, for all ℓ ∈ Dom(S),
we have S′(ℓ) = S(ℓ). Remark that semantic typing is stable under store exten-
sion: if S ⊨ v : τ and S′ extends S, we also have S′ ⊨ v : τ.
[Figure: applet and environment]
the store control that restricts references of type t ref, t ∈ Dom(PV), to values in
PV(t), and allows arbitrary writes to other references: Prot(PV, S)(ℓ) = PV(t)
if S(ℓ) = t and t ∈ Dom(PV), and Prot(PV, S)(ℓ) is undefined otherwise.
Lemma 3 only provides half of the security property: it shows that writes
in instrumented code are safe, but only the execution environment contains
instrumented code; the applet code is not instrumented and could therefore
perform illegal writes to sensitive locations, if it could access those locations.
In other words, we must make sure that all sensitive locations are encapsulated
inside functions, as in section 3.2. To this end, we will restrict the type E_apl of the
applet's execution environment e_apl to ensure that sensitive references cannot
"leak" into the applet and be assigned illegal values there. There are several
ways by which a sensitive reference of type t ref could leak into an applet:
We rule out all these cases by simply requiring that no type t ref occurs (at any
depth) in E_apl. This leads to the following security property:
The requirement that no t ref occurs in E_apl is clearly too strong: nothing
wrong could happen if, for instance, one of the environment functions has type
t ref → unit (the t ref argument is provided by the applet). We conjecture that
it suffices to require that no type t ref occurs in E_apl at a positive occurrence
or under a ref constructor. However, our proof of Property 2, and in particular
the crucial containment lemma, does not extend to this weaker hypothesis.
Of course, this is not enough: the applet could forge unchecked values of type t,
by direct coercion from t's implementation type, and pass them to environment
functions. Hence, we also need to make the types t ∈ Dom(PV) abstract in the
applet, by type-checking it with a set of type definitions TD' obtained from
TD by removing the definitions of the types t ∈ Dom(PV). Then, for any
t ∈ Dom(PV), the only values of type t that can be manipulated by the applet
have been created and checked by the environment. This is depicted in Fig. 6.
To capture the run-time behavior of instrumented terms, we introduce a
variant of the semantic typing predicate, written PV, S ⊨ v : τ, which is similar
to the predicate S ⊨ v : τ from section 4.3, with the difference that a value v
belongs to a named type t only if v ∈ PV(t) in addition to v belonging to the
definition TD(t) of t:
- PV, S ⊨ b : ι if Typeof(b) = ι
- PV, S ⊨ v : t if PV, S ⊨ v : TD(t) and t ∈ Dom(PV) implies v ∈ PV(t)
Security property 3 Let e be the execution environment for applets, and s the
initial store. Assume that all function closures in e and s have been instrumented
with the IC scheme (that is, e and s are obtained by evaluating source terms
instrumented with IC). Assume PV ⊨ s : S and PV, S ⊨ e : E. Then, for
every applet a well-typed in E and in the restricted set TD' of type definitions,
we have Prot(PV, S), s, e ⊢ a ↛ err.
As pointed out in [34], type abstraction and procedural abstraction are two
orthogonal ways to protect data.
Another advantage of the approach described in this section over the in-
strumentation of writes described in section 5.1 is that it often leads to fewer
run-time checks. In particular, checks at coercions can sometimes be proven re-
dundant and therefore can be eliminated. Consider the following function that
adds a .old suffix to a file name:
λf. OK_filename(filename(concat(string(f), ".old")))
With the definition of PV(filename) given in section 4.2, it is easy to show that
if f belongs to PV(filename), then so does the concatenation of f and .old.
Hence, the OK test can be removed.
Of course, not all run-time tests can be removed this way: consider what
happens if the suffix is given as argument:
λf. λs. OK_filename(filename(concat(string(f), s)))
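A hedged OCaml rendering of these two situations may help; the concrete representation and the PV(filename) predicate below are invented for illustration (here PV(filename) is "/tmp/ followed by a single path component"). In add_old the argument of the check always ends in .old, so membership is preserved and the check could be optimized away; in add_suffix the suffix comes from the applet, so the check must remain:

```ocaml
(* Hypothetical sketch; PV(filename) is modelled as "/tmp/<name>"
   where <name> contains no further slash. *)
type filename = Filename of string

let pv_filename s =
  String.length s >= 5
  && String.sub s 0 5 = "/tmp/"
  && not (String.contains (String.sub s 5 (String.length s - 5)) '/')

let ok_filename s =
  if pv_filename s then Filename s else invalid_arg "OK_filename"

(* If f is in PV(filename), so is f ^ ".old": this check is provably
   redundant and could be eliminated. *)
let add_old (Filename f) = ok_filename (f ^ ".old")

(* The suffix s is supplied by the applet: this check must remain. *)
let add_suffix (Filename f) s = ok_filename (f ^ s)
```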
In some cases, the types t ∈ Dom(PV) cannot be made abstract in the applet,
e.g. because it would make writing the applet too inconvenient, or entail too
much run-time overhead. We can adapt the approach presented in section 5.2
to these cases, by reverting to procedural abstraction and putting checks not
only at coercions, but also on all values of types t ∈ Dom(PV) that come from
the applet. (This matches current practice in Unix kernels, where parameters
to system calls are always checked for validity on entrance to the system call.)
Figure 7 depicts this approach.
The checking of values coming from the applet is achieved by a standard
wrapping scheme applied to all functions of the execution environment, inserting
OK_t coercions at all negative occurrences of types t ∈ Dom(PV). For instance,
if the execution environment needs to export a function f : t → t, it will actually
export the function λx.f(OK_t(x)), which validates its argument before passing
it to the original function.
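Concretely, this wrapper pattern can be sketched in OCaml; the validation function and the wrapped function below are invented placeholders, not MMM code:

```ocaml
(* Hypothetical sketch of λx. f (OK_t x): validate the applet-supplied
   argument before calling the environment's original function.
   The stand-in check takes PV(t) to be the natural numbers. *)
let ok_t x = if x >= 0 then x else invalid_arg "OK_t"

let f x = x + 1                       (* original environment function *)

let exported_f = fun x -> f (ok_t x)  (* what is actually exported *)
```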
We formalize these ideas in a slightly different way, in order to build upon
the results of section 5.2. Start from an applet environment defined by top-level
bindings of the form
let f_i : τ_i = a_i
We assume given a set TD' of type definitions against which the a_i and the
applets are type-checked, and a valuation PV' assigning permitted values to
named types in TD'.
We first associate a new named type t̂ to each sensitive type t ∈ Dom(PV').
The type t̂ is defined as synonymous with t, and is intended to represent those
values of type t that have passed run-time validation. We define the t̂ types by
taking
TD = TD' ⊕ [t̂ ↦ t | t ∈ Dom(PV')]
and restrict the values they can take using the valuation PV defined by PV(t̂) =
PV'(t) and PV undefined on other types.
Let σ be the substitution {t̂ ← t | t ∈ Dom(PV')}. We transform the
bindings for the applet environment as follows:
let f_i : τ_i = W⁺(IC(σ(a_i)) : τ_i)
That is, we rewrite the terms a_i to use the type t̂ instead of t for all t ∈
Dom(PV'); then apply the IC instrumentation scheme, thus adding an
OK_t̂ check to each coercion t̂(a); finally, apply the W⁺ wrapping scheme to the
instrumented term, in order to perform both validation and coercion from t to t̂
on entrance, and the reverse coercion from t̂ to t on exit. Wrapping is directed
by the expected type for its result, and is contravariant with respect to function
types. We thus define both a wrapping scheme W⁺ for positive occurrences of
types and another W⁻ for negative occurrences.
W⁺(a : ι) = W⁻(a : ι) = a
W⁺(a : t) = t(a) if t ∈ Dom(PV')
W⁻(a : t) = OK_t̂(t̂(a)) if t ∈ Dom(PV')
W⁺(a : t) = W⁻(a : t) = a if t ∉ Dom(PV')
W⁺(a : τ₁ × τ₂) = (W⁺(fst(a) : τ₁), W⁺(snd(a) : τ₂))
W⁻(a : τ₁ × τ₂) = (W⁻(fst(a) : τ₁), W⁻(snd(a) : τ₂))
W⁺(a : τ₁ → τ₂) = λx.W⁺(a(W⁻(x : τ₁)) : τ₂)
W⁻(a : τ₁ → τ₂) = λx.W⁻(a(W⁺(x : τ₁)) : τ₂)
W⁺(a : τ ref) = W⁻(a : τ ref) = a
if no t ∈ Dom(PV') occurs in τ
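The contravariance of the wrapping at function types can be illustrated by a small OCaml sketch at a higher-order type; all names, the representation of t as int, and the stand-in check are invented for the example:

```ocaml
(* Hypothetical sketch of contravariant wrapping at the type
   (t -> t) -> t: wrapping positively flips to wrapping the functional
   argument negatively, which in turn wraps *its* argument positively.
   t is modelled as int; ok_t stands in for OK_t (PV(t) = naturals). *)
let ok_t x = if x >= 0 then x else invalid_arg "OK_t"

(* W+(a : t) = a (coercion elided);  W-(a : t) = ok_t a *)
let wplus_t a = a
let wminus_t a = ok_t a

(* W-(a : t -> t) = fun x -> W-(a (W+ x) : t) *)
let wminus_fn a = fun x -> wminus_t (a (wplus_t x))

(* W+(a : (t -> t) -> t) = fun g -> W+(a (W- g) : t) *)
let wplus_hof a = fun g -> wplus_t (a (wminus_fn g))
```

Here an applet-supplied callback g is wrapped with W⁻, so every value it returns to the environment is validated.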
We are therefore back to the situation studied in section 5.2: the types t̂
are abstract in the applets and all coercions to t̂ are instrumented in the applet
environment. Thus, by Property 3, we obtain that the values of references with
types t̂ ref always remain within PV(t̂) = PV'(t) during the execution of any
well-typed applet.
Security property 4 Let e be the execution environment for applets and s the
initial store. Assume that e and s are obtained by evaluating a set of transformed
bindings let f_i : τ_i = W⁺(IC(σ(a_i)) : τ_i) as described above. Assume PV ⊨
s : S and PV, S ⊨ e : E. Then, for every applet a well-typed in E and in the
initial set TD' of type definitions, we have Prot(PV, S), s, e ⊢ a ↛ err.
presents the main techniques used in the MMM safe execution environment for
applets, and relates them with the formal results we have obtained in this paper.
that performs sound static type-checking and records correctly all external mod-
ules referenced) and has not been modified afterwards (e.g. by hand-modifying
some of the interfaces in the object file). This can be enforced in two ways in
the context of Web applets.
The first way is to transmit applets over the network as Caml source code,
which is type-checked and compiled locally (by calling the Objective Caml byte-
code compiler), then dynamically linked inside the browser. Unless the local host
is compromised, no tampering with the object files or the Caml type-checker
itself can happen. Unlike applet systems based on source-level interpretation, we
still benefit from the efficiency of the Caml bytecode interpreter.
Transmission of applets in source form is often criticized on several grounds:
the source code is larger than compiled bytecode; local compilation takes too
much time; applet writers do not want to publicize their source. Our experience
with Caml is that bytecode object files are about the same size as the source
code (unless heavily commented); the bytecode compiler is very fast and the
compilation times are small compared with Internet latencies; finally, bytecode
is easy to decompile and does not offer significant protection against reverse
engineering.
Another way to ensure the correctness of type annotations in bytecode object
files is to rely on a cryptographic signature on the object file, which is checked
locally by the MMM browser against a list of trusted signers before linking the
object file in memory. Unlike Microsoft-style applet signing, this signature is
not necessarily made by the author of the applet, and carries no guarantees on
what the applet actually does; instead, the signature is made by the person or
site who performs the compilation and type-checking of the applet, and certifies
only that the applet passed type-checking by an unmodified Caml compiler and
that its object file has not been tampered with since.
We initially envisioned having some centralized type-checking authority: one
or several reputable sites (such as INRIA) that accept source code from applet
developers, type-check and compile them locally, sign the object file and return it
to the applet developer. The object file can then be made available on the Web.
Seeing the authority's signature, browsers can trust the type information con-
tained in the object file and use it to validate the applet against the environment
they provide. In retrospect, this approach relies too much on a centralized au-
thority to scale up to the world-wide Web. However, it is perfectly suited to the
distribution of compiled applets across a restricted network such as a corporate
Intranet.
menu items, viewers for new types of embedded documents, display functions
for new HTML tags, and decoding functions for new "content-encoding" types.
The applet environment is derived from the OCaml standard library mod-
ules and the MMM implementation modules by two major techniques: hiding
of unsafe functions via module thinning, and wrapping of unsafe functions with
capability checks.
Module thinning: The ML module system offers the ability to take a restricted
view of an existing module via a signature constraint:
module RestrM = (M : RestrSig)
Only those components of M that are mentioned in the signature RestrSig are
visible in RestrM, and they have the types specified in RestrSig, which may
be less precise (more abstract) than their original types in M. This module thin-
ning mechanism thus supports both hiding components (functions, variables,
types, exceptions, sub-modules) of a module and making some type components
abstract. No code duplication occurs during thinning: the functions in RestrM
share their code with the corresponding functions in M.
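The thinning mechanism can be shown with a small self-contained example; the module and signature names are invented for the illustration:

```ocaml
(* Hypothetical illustration of module thinning: M exposes a concrete
   type and two operations; RestrSig hides one operation and makes
   the type abstract. *)
module M = struct
  type token = int
  let make n = n
  let reveal (t : token) = t          (* exposes the representation *)
end

module type RestrSig = sig
  type token                           (* abstract in the thinned view *)
  val make : int -> token
end

module RestrM = (M : RestrSig)
(* RestrM.reveal is hidden; RestrM.token can no longer be used as int. *)
```

Code using only RestrM cannot inspect or forge token values, yet RestrM.make shares its code with M.make.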
Large parts of the applet environment are obtained by thinning existing
OCaml library modules. In the OCaml standard library, we hide by thinning all
file input-output functions, as well as related system interface functions (such
as reading environment variables and executing shell commands), and of course
all type-unsafe operations (such as array accesses without bound checks and
functions that operate on the low-level representation of data structures). For
good measure, we also make abstract a few data types such as lexer buffers to
hide their internal structure.
In the CamlTk GUI toolkit, we hide all functions that return widgets that
may belong to the browser or to other applets. Such functions include finding
the parent of a widget, finding a widget by its name, finding which widget owns
the focus or the selection, etc. This is less restrictive than checking that widgets
manipulated by the applet are children of its top-level widget: the applet can still
open new windows and populate them with widgets unrelated to its initial top-
level widget. Other functions that might affect browser widgets (such as binding
events on all widgets of a given class or tag) are also removed.
extension for applets loaded from the local disk. Capabilities are of course rep-
resented by an abstract data type, to prevent an applet from forging extra ca-
pabilities.
All input/output operations as well as registration of browser extensions
check the capabilities presented by the applet. If the applet does not possess the
capability to perform the requested operation, the browser prompts the user via
a pop-up window. The user can then refuse the operation (aborting the execution
of the applet), grant permission for this particular operation, or grant permission
for further operations of the same kind as well. In the latter case, the browser
extends in place the capabilities of the applet. To minimize the number of times
the user is prompted, an applet can also request in advance the capabilities it
needs later.
let mycapa = Capabilities.get ()
let open_in = open_in_capa mycapa
let open_out = open_out_capa mycapa
... code using open_in and open_out as usual ...
The remainder of the applet can then use the functions open_in and open_out
thus obtained by partial application as if they were the normal file opening
functions.
MMM makes this approach more convenient by using ML functors to per-
form the partial applications. Functors are parameterized modules, presented
as functions from modules to modules. They provide a convenient mechanism
for parameterization en masse: rather than partially applying n functions to m
parameters, a structure containing the m parameters is passed once to a functor
that returns a structure containing the n functions already partially applied to the
parameters. In the case of MMM, we thus have a number of functors that take
the applet's capability as argument and return structures defining capability-
enabled variants of the standard I/O and browser interface functions, with the
same interface as for standalone programs. For instance, to perform file I/O, an
MMM applet does the following:
module MyCapa =
  struct
    let capabilities = Capabilities.get ()
  end
module IO = Safeio (MyCapa)
open IO
... code using open_in and other I/O operations as usual ...
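A hedged, self-contained sketch of this functor pattern follows; the Capabilities and Safeio modules here are minimal stand-ins, not MMM's actual implementation, and the real I/O is elided:

```ocaml
(* Hypothetical stand-ins for MMM's capability-passing functor pattern. *)
module Capabilities = struct
  type t = { can_read_files : bool }
  let get () = { can_read_files = true }   (* as granted by the browser *)
end

module type CAPA = sig
  val capabilities : Capabilities.t
end

module Safeio (C : CAPA) = struct
  (* Capability-enabled variant of a file-reading primitive; the real
     system call is replaced by a stub result. *)
  let read_file name =
    if C.capabilities.Capabilities.can_read_files
    then "contents of " ^ name
    else failwith "capability check failed"
end

module MyCapa = struct
  let capabilities = Capabilities.get ()
end

module IO = Safeio (MyCapa)
```

The applet passes its capability structure once; every function in the resulting IO module is already specialized to it.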
7.3 Assessment
The MMM security architecture relies heavily on the three basic ingredients that
we considered in our formalization:
- Lexical scoping ensures that applets cannot access all the modules composing
the browser and the Caml standard library, but only those "safe" modules
made available to the applet during linking.
- Type abstraction prevents the applet from forging or tampering with its
capability list. It also ensures that the applet cannot forge file descriptors or
GUI widgets, but has to go through the safe libraries to create them.
- Procedural abstraction is used systematically to wrap sensitive functions
(such as opening files or network connections, as well as installing a browser
extension) with capability checks.
Resources that can be explicitly deallocated and reassigned later raise in-
teresting problems. A prime example is Unix file descriptors, which are small
integers. The type of file descriptors is abstract, and an applet cannot forge a
file descriptor itself. However, it could open a file descriptor on a permitted file,
close it immediately, keep the abstract value representing the descriptor, and
wait until the browser opens a file or network connection and receives the same
file descriptor in return. Then, the applet can do input/output directly on the
file descriptor, thus accessing unauthorized files or network connections. This
"reuse" attack is of course possible only with explicitly-deallocated resources:
with implicit deallocation, the resources cannot be deallocated and reallocated
as long as the applet keeps a handle on the resource. MMM addresses this prob-
lem by wrapping Unix file descriptors in an opaque data structure containing
a "valid" bit that is set to false when the file descriptor is closed, and checked
before every input/output operation.
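The "valid bit" wrapper just described can be sketched as follows; the underlying descriptor is modelled as an int and the real close/read system calls are elided, so all names here are illustrative:

```ocaml
(* Hypothetical sketch of MMM's valid-bit wrapper for file descriptors. *)
module Fd : sig
  type t
  val wrap : int -> t
  val close : t -> unit
  val use : t -> (int -> 'a) -> 'a
end = struct
  type t = { fd : int; mutable valid : bool }
  let wrap fd = { fd; valid = true }
  let close d = d.valid <- false        (* the real close(2) is elided *)
  let use d f =
    if d.valid then f d.fd
    else failwith "I/O on a closed file descriptor"
end
```

Even if the kernel later reuses the same numeric descriptor for a browser-owned connection, the applet's stale wrapper is rejected by the valid-bit check.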
The last MMM security feature not addressed by our framework is confiden-
tiality. Our framework focuses on ensuring the integrity of the browser and of
the host machine. The MMM applet environment contains security restrictions
to ensure integrity (such as controlled write access to files), but also restrictions
intended to protect the confidentiality of user data (such as controlled read ac-
cesses to files and restricted access to the network). Some restrictions address
both integrity and confidentiality problems: for instance, a malicious document
decoder could both distort the document as displayed by the browser (an in-
tegrity threat) and leak confidential information contained in the document to
a third party (a confidentiality threat).
8 Related work
8.1 Type systems for security
The work most closely related to ours is the recent formulations of Denning's
information flow approach to security [13,14] as non-standard type systems by
Palsberg and Ørbæk [31], Volpano and Smith [41,40], and Heintze and Riecke
[20]. (Abadi et al. [2] reformulate some of those type systems in terms of a more
basic calculus of dependency.) The main points of comparison with our work are
listed below.
Information flow vs. integrity: The type systems developed in previous works
all focus on secrecy properties, following the information flow approach. In par-
ticular, they allow high-security data to be exposed as long as no low-security
code uses this data. Our work focuses on more basic integrity properties via ac-
cess control. We view those integrity guarantees as a prerequisite to establishing
meaningful confidentiality properties.
Imperative vs. purely functional programs: [31] and [20] consider purely
functional languages in the style of the λ-calculus. This makes formulating the
security properties delicate: [31] proves no security property properly speak-
ing, only a subject reduction property that shows the internal consistency of the
calculus, but not its relevance to security; [20] does show a non-interference prop-
erty (that the value of a low-security expression is independent of the values of
high-security parameters), but it is not obvious how this result applies to actual
applet/browser interactions, especially input/output. Instead, we have followed
[41,40] and formulated our security policy in terms of in-place modifications on
a store, which provides a simple and intuitive notion of security violation.
Run-time validation of data: Only [31] and our work consider the possibility
of checking low-security data at run-time and promoting them to high security. In
[41,40,20], once some data is labeled "low security", it remains so throughout
the program and causes all data it comes in contact with to be marked "low
security" as well. We believe that, in a typical applet/browser interaction, this
policy leads to rejecting almost all applets as insecure. Run-time validation of
untrusted data is essential in practice to allow a reasonable range of applets to
run.
Subtyping vs. named types and coercions: All previous works consider
type systems with subtyping, which provides a good match for the flow analysis
approach they follow [30]. In contrast, we only use type synonyms with possibly
checked coercions between a named type and its implementation type. However,
the connections between subtyping and explicit coercions are well known [9], and
we do not think this makes a major difference.
9 Concluding remarks
We have identified three basic techniques for enforcing a fairly realistic security
policy for applets: lexical scoping, procedural abstraction, and type abstraction.
These programming techniques are of course well known, but we believe that this
work is the first to characterize precisely their implications for program security.
The techniques proposed here seem to match relatively well current practice
in the area of Web applets. In particular, they account fairly well for Rouaix's
implementation of safe libraries in the MMM browser.
Our techniques put almost no constraints on the applets, except being well-
typed in a simple, completely standard type system. The security effort is con-
centrated on the execution environment provided by the browser. Typing the
applets in a richer type system, such as the type systems for information flow of
[31,41,40,20] or the effect and region system of [38], could provide more infor-
mation on the behavior of the applet and enable more flexible security policies
in the execution environment. However, it is impractical to rely on rich type
systems for applets, because these type systems are not likely to be widely ac-
cepted by applet developers. Whether these rich type systems can be applied to
the execution environment only, while still using a standard type system for the
applets, is an interesting open question.
On the technical side, the proofs of the type-based security properties are
variants of usual type soundness proofs. It would be interesting to investigate
the security content of other classical semantic results such as representation
independence and logical relations. Given the importance of communications
between the applet and its environment, it could be worthwhile to reformulate
our security results for a calculus of communicating processes [6,3,1].
Acknowledgements
This work has been partially supported by GIE Dyade under the "Verified In-
ternet Protocols" project. A preliminary version of this paper appeared in the
References
1. M. Abadi. Secrecy by typing in security protocols. In Theoretical Aspects of
Computer Software '97, volume 1281 of Lecture Notes in Computer Science, pages
611-638. Springer-Verlag, Sept. 1997.
2. M. Abadi, A. Banerjee, N. Heintze, and J. G. Riecke. A core calculus of dependency.
In 26th symposium Principles of Programming Languages, pages 147-160. ACM
Press, 1999.
3. M. Abadi and A. D. Gordon. Reasoning about cryptographic protocols in the Spi
calculus. In CONCUR'97: Concurrency Theory, volume 1243 of Lecture Notes in
Computer Science, pages 59-73. Springer-Verlag, July 1997.
4. D. S. Alexander, W. A. Arbaugh, M. W. Hicks, P. Kakkar, A. D. Keromytis, J. T.
Moore, C. A. Gunter, S. M. Nettles, and J. M. Smith. The SwitchWare active
network architecture. IEEE Network, 12(3):29-36, 1998.
5. D. S. Alexander, W. A. Arbaugh, A. D. Keromytis, and J. M. Smith. Security in
active networks. In J. Vitek and C. Jensen, editors, Secure Internet Programming,
Lecture Notes in Computer Science. Springer-Verlag Inc., New York, NY, USA,
1999.
6. J.-P. Banatre and C. Bryce. A security proof system for networks of communicating
processes. Research report 2042, INRIA, Sept. 1993.
7. J.-P. Billon. Security breaches in the JDK 1.1 beta2 security API. Dyade,
http://www.dyade.fr/fr/actions/VIP/SecHole.html, Jan. 1997.
8. N. S. Borenstein. Email with a mind of its own: the Safe-Tel language for en-
abled mail. In IFIP International Working Conference on Upper Layer Protocols,
Architectures and Applications, 1994.
9. V. Breazu-Tannen, T. Coquand, C. A. Gunter, and A. Scedrov. Inheritance as
implicit coercion. Information and Computation, 93(1):172-221, 1991.
10. K. Brunnstein. Hostile ActiveX control demonstrated. RISKS Forum, 18(82), Feb.
1997.
11. L. Cardelli, S. Martini, J. C. Mitchell, and A. Scedrov. An extension of system F
with subtyping. Information and Computation, 109(l-2):4-56, 1994.
12. D. Dean, E. W. Felten, D. S. Wallach, and D. Balfanz. Java security: Web browsers
and beyond. In D. E. Denning and P. J. Denning, editors, Internet Besieged:
Countering Cyberspace Scofflaws, pages 241-269. ACM Press, 1997.
13. D. E. Denning. A lattice model of secure information flow. Commun. ACM,
19(5):236-242, 1976.
14. D. E. Denning and P. J. Denning. Certification of programs for secure information
flow. Commun. ACM, 20(7):504-513, 1977.
15. S. Drossopoulou and S. Eisenbach. Java is type safe - probably. In Proc. 11th
European Conference on Object Oriented Programming, volume 1241 of Lecture
Notes in Computer Science, pages 389-418. Springer-Verlag, June 1997.
16. M. Erdos, B. Hartman, and M. Mueller. Security reference model for the Java
Developer's Kit 1.0.2. JavaSoft, http://java.sun.com/security/SRM.html, Nov.
1996.
17. S. N. Freund and J. C. Mitchell. A type system for object initialization in the
Java bytecode language. In Object-Oriented Programming Systems, Languages
and Applications 1998, pages 310-327. ACM Press, 1998.
39. M. Tofte. Type inference for polymorphic references. Information and Computa-
tion, 89(1), 1990.
40. D. Volpano and G. Smith. A type-based approach to program security. In Proceed-
ings of TAPSOFT'97, Colloquium on Formal Approaches in Software Engineer-
ing, volume 1214 of Lecture Notes in Computer Science, pages 607-621. Springer-
Verlag, 1997.
41. D. Volpano, G. Smith, and C. Irvine. A sound type system for secure flow analysis.
Journal of Computer Security, 4(3):1-21, 1996.
42. D. S. Wallach, D. Balfanz, D. Dean, and E. W. Felten. Extensible security ar-
chitectures for Java. Technical report 546-97, Department of Computer Science,
Princeton University, Apr. 1997.
43. D. S. Wallach and E. W. Felten. Understanding Java stack inspection. In Pro-
ceedings of the 1998 IEEE Symposium on Security and Privacy. IEEE Computer
Society Press, 1998.
44. F. Yellin. Low level security in Java. In Proceedings of the Fourth International
World Wide Web Conference, pages 369-379. O'Reilly, 1995.
In this appendix, we formalize the intuition that if the type of the applet environ-
ment does not contain certain ref types, then references of those types cannot
be exchanged between the applet and the environment, and remain "contained"
in one of them.
We annotate each source-language term a as coming either from the execution
environment (a_env) or from the applet (a_app). We let m, n range over the two
"worlds" env and app, and write m̄ for the complement of m, i.e. env̄ = app and
app̄ = env.
Let T be a set of type expressions satisfying the following closure property:
if τ ∈ T, then all types τ' that contain τ as a sub-term also belong to T. (In
section 5.1, we take T to be the set of all types containing an occurrence of
t ref; in appendix B, T is the set of all types containing an occurrence of any
ref type.) We partition the set of locations into three countable sets:
To ensure that locations allocated during evaluation are drawn from the correct
set, we assume all source terms a_m annotated with their static type τ and their
world m and replace the evaluation rule for reference creation (rule 8) by the
following rule:
Lemma 7 (Containment lemma). Let E_apl be the type of the execution envi-
ronment for applets. Assume E_apl(x) ∉ T for all x ∈ Dom(E_apl). Further assume
E ⊢ a_m : τ and S ⊨ e : E and ⊨ s : S and C_m(e, s). If φ, e, s ⊢ a_m → v/s',
then C_m(v, s'), and for all values w and worlds n such that C_n(w, s), we have
C_n(w, s').
Proof. The proof is by induction on the evaluation derivation and case analysis
on a. Notice that by Lemma 2, we have the additional result that there exists a
store typing S' extending S such that S' ⊨ v : τ and ⊨ s' : S'. This makes the
semantic typing hypotheses go through the induction. The interesting cases are
assignment and function application; the other cases are straightforward.
Assignment: a is (a₁ := a₂). We apply the induction hypothesis twice,
obtaining
Hence, v₁ is a closure λx.a'_n[e'], and the last evaluation rule used is rule 4.
If n = m (intra-world call), we have C_m(e'{x ← v₂}, s₂) as a consequence
of C_m(v₁, s₂) and C_m(v₂, s₂), and the two conclusions follow easily from the
induction hypothesis applied to the evaluation of a'_n.
If n = m̄ (cross-world call), then the type σ → τ of the function must
occur as a sub-term of the typing environment E_apl. Hence, σ ∉ T and τ ∉ T.
Since the type of v₂ is not in T, by Lemma 6 it follows that C_n(v₂, s₂). Hence,
C_n(e'{x ← v₂}, s₂), and we can apply the induction hypothesis to the evaluation
of a'_n: φ, e'{x ← v₂}, s₂ ⊢ a'_n → v/s'. The resulting value v is contained in world
n and has type τ; applying again Lemma 6, we get C_m(v, s'), which is the
expected result.
ML(b, s) = ∅
ML((v₁, v₂), s) = ML(v₁, s) ∪ ML(v₂, s)
The case for closures follows conditions 1 and 4 in the informal discussion from
section 3.2. For condition 5, we use the condition C_app(v₁, s₁) instead of the
more natural ℓ ∉ ML(v₁, s₁), so that the equations remain increasing in ML and
the existence of the smallest fixpoint is guaranteed. The typing hypothesis (that
E_apl contains no ref types) renders condition 3 vacuous, and also dispenses us
with defining ML over locations.
The following lemma shows that modifiable locations are indeed the only
locations modified during the application of a closure.
Concepts
The Role of Trust Management in Distributed
Systems Security
1 Introduction
With the advent of the Internet, distributed computing has become increasingly
prevalent. Recent developments in programming languages, coupled with the in-
crease in network bandwidth and end-node processing power, have made the Web
a highly dynamic system. Virtually every user of the Internet is at least aware of
languages such as Java [15], JavaScript, Active-X, and so on. More "futuristic"
projects involve computers running almost exclusively downloaded interpreted-
language applications (Network PC), or on-the-fly programmable network infras-
tructures (Active Networks). On a more mundane level, an increasing number
of organizations use the Internet (or large Intranets) to connect their various
offices, branches, databases, etc.
All of these emerging systems have one thing in common: the need to grant or
restrict access to resources according to some security policy. There are several
issues worth noting.
First, different systems and applications have different notions of what a re-
source is. For example, a web browser may consider CPU cycles, network band-
width, and perhaps private information to be resources. A database server's
notion of "resource" would include individual records. Similarly, a banking ap-
plication would equate resources with money and accounts. While most of these
resources can be viewed as combinations of more basic ones (such as CPU cy-
cles, I/O bandwidth, and memory), it is often more convenient to refer to those
combinations as single resources, abstracting from lower-level operations. Thus,
a generic security mechanism should be able to handle any number and type of
resources.
What should also be obvious from the few examples mentioned above is
that different applications have different access-granting or -restricting policies.
The criteria on which a decision is based may differ greatly among different
applications (or even between different instances of the same application). The
security mechanism should be able to handle those different criteria.
One security mechanism often used in operating systems is the Access Control
List (ACL). Briefly, an ACL is a list describing which access rights a principal has
on an object (resource). For example, an entry might read "User Foo can Read
File Bar." Such a list (or table) need not physically exist in one location but may
be distributed throughout the system. The Unix filesystem "permissions"
mechanism is essentially an ACL.
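The ACL model described above can be sketched in a few lines. This is an illustrative toy, not code from the chapter; the object names and rights are hypothetical.

```python
# Minimal ACL sketch: each object (resource) maps principals to the set of
# rights they hold on it, e.g. the entry "User Foo can Read File Bar."
acl = {
    "/file/bar": {"foo": {"read"}, "admin": {"read", "write"}},
}

def check(principal, right, obj):
    """Grant access iff the ACL entry for obj lists the right for principal."""
    return right in acl.get(obj, {}).get(principal, set())

print(check("foo", "read", "/file/bar"))   # True
print(check("foo", "write", "/file/bar"))  # False
```

Note that the decision logic is entirely identity-based: the list says nothing about *why* a principal holds a right, which is exactly the limitation the following discussion addresses.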
ACLs have been used in distributed systems, because they are conceptually
easy to grasp and because there is an extensive literature about them. How-
ever, there are a number of fundamental reasons that ACLs are inadequate for
distributed-system security, e.g.,
- Delegation: entities to which authority has been delegated cannot
directly specify overall security policy; rather, all they can do is "certify"
lower-level authorities. This authorization structure leads easily to inconsistencies
among locally-specified sub-policies.
- Expressibility and Extensibility: A generic security mechanism must be able
to handle new and diverse conditions and restrictions. The traditional ACL
approach has not provided sufficient expressibility or extensibility. Thus,
many security policy elements that are not directly expressible in ACL form
must be hard-coded into applications. This means that changes in security
policy often require reconfiguration, rebuilding, or even rewriting of applica-
tions.
- Local trust policy: The number of administrative entities in a distributed
system can be quite large. Each of these entities may have a different trust
model for different users and other entities. For example, system A may trust
system B to authenticate its users correctly, but not system C; on the other
hand, system B may trust system C. It follows that the security mechanism
should not enforce uniform and implicit policies and trust relations.
^ Sometimes the public keys are hardcoded into the application, which deprives the
environment of even the limited flexibility provided by ACLs. Such an example is
the under-development IEEE 1394 Digital Content Protection standard.
2 Trust Management
2.1 Basics
In the rest of this section, we survey several recent and ongoing trust-manage-
ment projects in which different answers to these questions are explored.
One goal of the PolicyMaker project was to make the trust-management en-
gine minimal and analyzable. Architectural boundaries were drawn so that a
fair amount of responsibility was placed on the calling application rather than
the trust-management engine. In particular, the calling application was made
responsible for all cryptographic verification of signatures on credentials and
requests. One pleasant consequence of this design decision is that the applica-
tion developer's choice of signature scheme(s) can be made independently of his
choice of whether or not to use PolicyMaker for compliance checking. Another
important responsibility that was assigned to the calling application is creden-
tial gathering. The input (r, C, P) supplied to the trust-management module is
treated as a claim that credential set C contains a proof that request r complies
with policy P. The trust-management module is not expected to be able to dis-
cover that C is missing just one credential needed to complete the proof and to
go fetch that credential from e.g., the corporate database, the issuer's web site,
the requester himself, or elsewhere. Later trust-management engines, including
KeyNote [4] and REFEREE [11], divide responsibility between the calling application
and the trust-management engine differently from the way PolicyMaker
divides it.
2.3 KeyNote
KeyNote [4] was designed according to the same principles as PolicyMaker, using
credentials that directly authorize actions instead of dividing the authorization
task into authentication and access control. Two additional design goals for
KeyNote were standardization and ease of integration into applications. To address
these goals, KeyNote assigns more responsibility to the trust-management
engine than PolicyMaker does and less to the calling application; for example,
cryptographic signature verification is done by the trust-management engine in
KeyNote and by the application in PolicyMaker. KeyNote also requires that
credentials and policies be written in a specific assertion language, designed to
work smoothly with KeyNote's compliance checker. By fixing a specific and ap-
propriate assertion language, KeyNote goes further than PolicyMaker toward
facilitating efficiency, interoperability, and widespread use of carefully written
credentials and policies.
A calling application passes to a KeyNote evaluator a list of credentials, policies,
and requester public keys, and an "Action Environment." This last element
consists of a list of attribute/value pairs, similar in some ways to the Unix
shell environment. The action environment is constructed by the calling applica-
tion and contains all information deemed relevant to the request and necessary
for the trust decision. The action-environment attributes and the assignment of
their values must reflect the security requirements of the application accurately.
Identifying the attributes to be included in the action environment is perhaps the
most important task in integrating KeyNote into new applications. The result
of the evaluation is an application-defined string (perhaps with some additional
information) that is passed back to the application. In the simplest case, the
result is something like "authorized."
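The evaluation model just described can be sketched as follows. This is a toy illustration of the concept, not the real KeyNote API; the function names and the condition encoding are our own.

```python
# Toy sketch of the KeyNote evaluation model: the calling application builds
# an action environment (attribute/value pairs); each assertion's Conditions
# field is modelled as a predicate over that environment; the evaluator
# returns an application-defined result string.
def evaluate(action_env, assertions, default="false"):
    """Return the result of the first assertion whose condition holds."""
    for condition, result in assertions:
        if condition(action_env):
            return result
    return default

# One assertion, mirroring the Conditions field of the example in Figure 1.
assertions = [
    (lambda env: env.get("file") == "/etc/passwd"
                 and env.get("access") == "read",
     "ok"),
]

print(evaluate({"file": "/etc/passwd", "access": "read"}, assertions))  # ok
```

In the real system the predicate is written in the KeyNote assertion language and signature verification precedes evaluation; the sketch shows only the action-environment lookup at the heart of the trust decision.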
The KeyNote assertion format resembles that of e-mail headers. An example
(with artificially short keys and signatures for readability) is given in Figure 1.
KeyNote-Version: 1
Authorizer: rsa-pkcs1-hex:"1023abcd"
Licensees: dsa-hex:"986512a1" ||
           rsa-pkcs1-hex:"19abcd02"
Comment: Authorizer delegates read
         access to either of the
         Licensees
Conditions: ($file == "/etc/passwd" &&
             $access == "read") ->
            {return "ok"}
Signature: rsa-md5-pkcs1-hex:"f00f5673"
KeyNote assertions are structured so that the Licensees field specifies explicitly
the principal or principals to which authority is delegated. Syntactically,
the Licensees field is a formula in which the arguments are public keys and the
operations are conjunction, disjunction, and threshold. The semantics of these
expressions are specified in [4].
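A Licensees-style formula can be evaluated against the set of keys that actually authorized a request, as in this hedged sketch (formulas as nested tuples and the operator names are our own encoding, not KeyNote syntax; see [4] for the real semantics):

```python
# Evaluate a Licensees-like formula: arguments are public keys, operators are
# conjunction, disjunction, and threshold ("k-of"), as described in the text.
def satisfied(formula, signers):
    if isinstance(formula, str):          # leaf: a single public key
        return formula in signers
    op, *args = formula
    if op == "and":
        return all(satisfied(a, signers) for a in args)
    if op == "or":
        return any(satisfied(a, signers) for a in args)
    if op == "k-of":                      # threshold: args[0] is k
        k, *subs = args
        return sum(satisfied(a, signers) for a in subs) >= k
    raise ValueError("unknown operator: %s" % op)

# The disjunction from the example assertion: either licensee key suffices.
print(satisfied(("or", "dsa:9865", "rsa:19ab"), {"rsa:19ab"}))  # True
```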
The programs in KeyNote are encoded in the Conditions field and are essen-
tially tests of the action environment variables. These tests are string compar-
isons, numerical operations and comparisons, and pattern-matching operations.
We chose such a simple language for KeyNote assertions for the following
reasons:
- AWK, one of the first assertion languages used by PolicyMaker, was criticized
as too heavyweight for most relevant applications. Because of AWK's com-
plexity, the footprint of the interpreter is considerable, and this discourages
application developers from integrating it into a trust-management compo-
nent. The KeyNote assertion language is simple and has a minimal-sized
interpreter.
- In languages that permit loops and recursion (including
AWK), it is difficult to enforce resource-usage restrictions, but applications
that run trust-management assertions written by unknown sources often
need to limit their memory- and CPU-usage.
We believe that for our purposes a language without loops, dynamic memory
allocation, and certain other features is sufficiently powerful and expressive.
The KeyNote assertion syntax is restricted so that resource usage is propor-
tional to the program size. Similar concepts have been successfully used in
other contexts [16].
- Assertions should be both understandable by human readers and easy for
a tool to generate from a high-level specification. Moreover, they should be
easy to analyze automatically, so that automatic verification and consistency
checks can be done. This is currently an area of active research.
- One of our goals is to use KeyNote as a means of exchanging policy and
distributing access control information otherwise expressed in an applica-
tion-native format. Thus the language should be easy to map to a number
of such formats (e.g., from a KeyNote assertion to packet-filtering rules).
- The language chosen was adequate for KeyNote's evaluation model.
Note that the decision to consult a CRL is (or should be) a matter of local policy.
REFEREE was designed with trust management for web browsing in mind,
but it is a general-purpose language and could be used in other applications.
Some of the design choices in REFEREE were influenced by experience (reported
in [6]) with using PolicyMaker for web-page filtering based on PICS labels [26]
and users' viewing policies. It is unclear whether the cost of building and an-
alyzing a more complex trust-management environment such as REFEREE is
justified by the ability to construct more sophisticated proofs of compliance than
those constructible in PolicyMaker. Assessing this tradeoff would require more
experimentation with both systems, as well as a rigorous specification and anal-
ysis of the REFEREE proof system, similar to the one for PolicyMaker given in
[7].
The Simple Public Key Infrastructure (SPKI) project of Ellison et al. [13]
has proposed a standard format for authorization certificates. SPKI shares with
our trust-management approach the belief that certificates can be used directly
for authorization rather than simply for authentication. However, SPKI certifi-
cates are not fully programmable; they are data structures with the following five
fields: "Issuer" (the source of authority), "Subject" (the entity being authorized
to do something), "Delegation" (a boolean value specifying whether or not the
subject is permitted to pass the authorization on to other entities), "Authoriza-
tion" (a specification of the power that the issuer is conferring on the subject),
and "Validity dates." The SPKI certificate format is compatible with the Sim-
ple Distributed Security Infrastructure (SDSI) local-names format proposed by
Rivest and Lampson [19], and Ellison et al. [13] explain how to integrate the
two.
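The five-field SPKI structure described above can be sketched as a plain data type. The field names follow the text; the real certificate format is an S-expression, and the concrete key and tag values below are hypothetical.

```python
from dataclasses import dataclass
import datetime

# Sketch of the five fields of an SPKI authorization certificate.
@dataclass
class SpkiCert:
    issuer: str               # source of authority
    subject: str              # entity being authorized
    delegation: bool          # may the subject pass the authority on?
    authorization: frozenset  # powers conferred on the subject
    not_before: datetime.date # validity dates
    not_after: datetime.date

    def valid_on(self, day):
        """True iff the certificate is within its validity period."""
        return self.not_before <= day <= self.not_after

cert = SpkiCert("key-A", "key-B", False, frozenset({"ftp-read"}),
                datetime.date(1999, 1, 1), datetime.date(1999, 12, 31))
print(cert.valid_on(datetime.date(1999, 6, 1)))  # True
```

Because validity dates are first-class fields, expiry-based (non-monotonic) policies fall out of the data structure itself, which is the point made in the next paragraph.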
The SPKI documentation [13] states that
The processing of certificates and related objects to yield an authoriza-
tion result is the province of the developer of the application or system.
The processing plan presented here is an example that may be followed,
but its primary purpose is to clarify the semantics of an SPKI certificate
and the way it and various other kinds of certificate might be used to
yield an authorization result.
Thus, strictly speaking, SPKI is not a trust-management engine according to
our use of the term, because compliance checking (referred to above as "process-
ing of certificates and related objects") may be done in an application-dependent
manner. If the processing plan presented in [13] were universally adopted, then
SPKI would be a trust-management engine. The resulting notion of "proof of
compliance" would be considerably more restricted than PolicyMaker's; essen-
tially, proofs would take the form of chains of certificates. On the other hand,
SPKI has a standard way of handling certain types of non-monotonic policies,
because validity periods and simple CRLs are part of the proposal.
There has been a great deal of interest in the problem of exposing the ability
to control network infrastructure. Much of this interest has been driven by
the desire to accelerate service creation. Sometimes services can be created using
features of existing systems. One of the most aggressive proposals is the notion
of programmable network infrastructure or "active networking." In an active
network, the operator or user has facilities for directly modifying the operational
semantics of the network itself. Thus, the roles of network and endpoint become
far more malleable for the construction of new applications. This is in contrast to
the "service overlay model" as employed, for example, in present-day Internet,
where service introduction at the "edge" of the virtual infrastructure is very easy,
but changes in the infrastructure itself have proven very difficult (e.g., RSVP
[8] and multicasting [12]). A number of active network architectures have been
proposed and are under investigation [1, 28, 16, 27, 9, 25].
A programmable network infrastructure is potentially more vulnerable to
attacks since a portion of the control plane is intentionally exposed, and this
can lead to far more complex threats than exist with an inaccessible control
plane. For example, a denial-of-service attack on the transport plane may also
inhibit access to the control plane. Unauthenticated access to the control plane
can have severe consequences for the security of the whole network.
It is therefore especially necessary for an active network to use a robust and
powerful authorization mechanism. Because of the many interactions between
network nodes and switchlets (pieces of code dynamically loaded on the switches,
or code in packets that is executed on every active node they encounter), a
versatile, scalable, and expressive mechanism is called for.
We have applied KeyNote [4] in one proposed active network security archi-
tecture, in the Secure Active Network Environment (SANE) [2] [3] developed at
the University of Pennsylvania as part of the SwitchWare project [1].
In SANE, the principals involved in the authorization and policy decisions
in the security model are users, programmers, administrators, and network
elements. The network elements are presumed to be under physical control of
an administrator. Programmers may not have physical access to the network
element, but may possess considerable access rights to resources present in the
network elements. Users may have access to basic services {e.g., transport), but
also resources that the network elements are willing to export to all users, at an
appropriate level of abstraction. Users may also be allowed to introduce their
own services, or load those written by others. In such a dynamic environment,
KeyNote is used to supply policy and authorization credentials for those compo-
nents of the architecture that enforce resource usage and access control limits.
In particular, KeyNote policies and credentials are used to:
Another area of broad recent interest is the security and containment of un-
trusted "mobile" code. That is, executable content or mobile code is received
by a host with a request to execute it; lacking any automatic mechanisms for
evaluating the security implications of executing such a piece of code, the host
needs to find some other way of determining the trustworthiness of that code.
Failure to properly contain mobile code may result in serious damage to, or
leakage of, information resident on the host. Such damage can be the result of
malicious intent (e.g., espionage, industrial or otherwise, or vandalism) or
unintentional (e.g., programming failures or unexpected interactions with other
system components or programs). Other consequences of failing to contain mo-
bile code include denial-of-service attacks (the now familiar phenomenon of Java
and JavaScript applets using all the system memory or CPU cycles, usually
^ In the case of SANE, the execution environment is a modified and restricted runtime
of the Caml [20] programming language.
In the time since "trust management" first appeared in the literature in [5], the
concept has gained broad acceptance in the security research community. Trust
management has a number of important advantages over traditional approaches
such as distributed ACLs, hardcoded security policies, and global identity cer-
tificates. A trust-management system provides direct authorization of security-
critical actions and decouples the problem of specifying policy and authorization
from that of distributing credentials.
Our work on trust management has focused on designing languages and com-
pliance checkers, identifying applications, and building practical toolkits. There
are important areas that we have not yet addressed. Foremost among those is
automated credential discovery; in our current systems, it is the responsibility of
the requester to submit all necessary credentials, under the assumption that he
holds all credentials relevant to him. Even then, however, intermediate creden-
tials that form the necessary connections between the verifier's policy and the
requester's credentials must be acquired and submitted. The various solutions
to this problem range from "leave it to the application" to using distributed
databases and lookup mechanisms for credential discovery. A solution along the
no assertions have yet signed off on it (or anything else). The checker will run the
assertions (f_0, POLICY), (f_1, s_1), ..., (f_{n-1}, s_{n-1}) that it has received as input,
not necessarily in that order and not necessarily once each, and see which acceptance
records are produced. Ultimately, the compliance checker approves the
request r if the acceptance record (0, POLICY, R), which means "policy approves
the initial action string," is produced.
Thus, abstractly, an assertion is a mapping from acceptance sets to acceptance
sets. Assertion (f_i, s_i) looks at an acceptance set A encoding the actions
that have been approved so far and the numbers and sources of the assertions
that approved them. Based on this information about what the sources it trusts
have approved, (f_i, s_i) outputs another acceptance set A'.
The following concrete examples show why PolicyMaker assertions are al-
lowed to approve multiple action strings for each possible request. That is, for
a given input request r, why do assertions need to do anything except say "I
approve r" or refuse to say it?
First, consider the following "co-signing required" assertion (f_0, POLICY):
"All expenditures of $500 or more require approval by A and B." Suppose that
A's policy is to approve such expenditures if and only if B approves them and
that B's is to approve them if and only if A approves them. Our acceptance
record structure makes such approvals straightforward. The credential (f_1, A)
can produce acceptance records of the form (1, A, R) and (1, A, R_B), where R
corresponds to the input request r; the meaning of the second is "I will approve
R if and only if B approves it." Similarly, the credential (f_2, B) can produce
records of the form (2, B, R) and (2, B, R_A). On input {(Λ, Λ, R)}, the sequence
of acceptance records (1, A, R_B), (2, B, R_A), (1, A, R), (2, B, R), (0, POLICY, R)
would be produced if the assertions were run in the order (f_1, A), (f_2, B), (f_1, A),
(f_2, B), (f_0, POLICY), and the request r would be approved. If assertions could
only produce binary approve/disapprove decisions, no transactions would ever be
approved, unless the trust-management system had some way of understanding
the semantics of the assertions and knowing that it had to ask A's and B's
credentials explicitly for a conditional approval. This would violate the goal of
having a general-purpose trust-management system that processes requests and
assertions whose semantics are only understood by the calling applications and
that vary widely from application to application.
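The co-signing run can be simulated concretely. This sketch is ours, not from the chapter: records are (assertion number, source, action string) triples, "R|B" stands in for the conditional string R_B, and "L" stands in for the initial marker Λ.

```python
# Credential (f_1, A): always commit conditionally ("R if B approves");
# approve R outright once any commitment from B (conditional or not) is seen.
def a_cred(acc):
    out = {(1, "A", "R|B")}
    if any(src == "B" and act in ("R", "R|A") for _, src, act in acc):
        out.add((1, "A", "R"))
    return out

def b_cred(acc):  # credential (f_2, B), symmetric to A's
    out = {(2, "B", "R|A")}
    if any(src == "A" and act in ("R", "R|B") for _, src, act in acc):
        out.add((2, "B", "R"))
    return out

def policy(acc):  # (f_0, POLICY): approve once both A and B approved R
    approved = {(s, a) for _, s, a in acc}
    return {(0, "POLICY", "R")} if {("A", "R"), ("B", "R")} <= approved else set()

acc = {("L", "L", "R")}                        # initial record (Lambda, Lambda, R)
for f in (a_cred, b_cred, a_cred, b_cred, policy):
    acc |= f(acc)                              # agglomerative application
print((0, "POLICY", "R") in acc)               # True
```

With binary approve/disapprove assertions this run would deadlock: neither A nor B would ever emit its first record.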
Second, consider the issue of "delegation depth." A very natural construction
to use in assertion (f_0, POLICY) is "I delegate authority to A. Furthermore, I
allow A to choose the parties to whom he will re-delegate the authority I've
delegated to him. For any party B involved in the approval of a request, there
must be a delegation chain of length at most two from me to B." Various "domain
experts" B_1, ..., B_t could issue credentials (f_1, B_1), ..., (f_t, B_t) that directly
approve actions in their areas of expertise by producing acceptance records of
the form (i, B_i, R^1). An assertion (g_j, s_j) that sees such a record and explicitly
trusts B_i could produce an acceptance record of the form (j, s_j, R^2), the meaning
of which is that "B_i approved R^1 directly, I trust B_i directly, and so I also
approve R^2." More generally, if an assertion (g_l, s_l) trusts s_k directly and sees an
acceptance record of the form (k, s_k, R^j), it can produce the acceptance record
(l, s_l, R^{j+1}). The assertion (f_0, POLICY) given above would approve an action
R^2 if and only if it were run on an acceptance set that contained a record of
the form (k, A, R^2), for some k. Note that (f_0, POLICY) need not know which
credential (f_i, B_i) directly approved R^1 by producing (i, B_i, R^1). All it needs to
know is that it trusts A and that A trusts some B_i whose credential produced
such a record.
The most general version of the compliance-checking problem is:
Proof of Compliance (POC):
Input: A request r and a set {(f_0, POLICY), (f_1, s_1), ..., (f_{n-1}, s_{n-1})} of
assertions.
Question: Is there a finite sequence i_1, i_2, ..., i_t of indices such that each i_j is
in {0, 1, ..., n-1}, but the i_j's are not necessarily distinct and not necessarily
exhaustive of {0, 1, ..., n-1}, and such that
(0, POLICY, R) ∈ (f_{i_t}, s_{i_t}) ∘ ··· ∘ (f_{i_1}, s_{i_1})({(Λ, Λ, R)})?
Promise: Each (f_i, s_i) runs in time O(N^c). On any input set that contains
(Λ, Λ, R), where R is the action string corresponding to request r, for each (f_i, s_i)
there is a set O_i of at most m action strings such that (f_i, s_i) only produces
output from O_i, and s is the maximum size of an acceptance record (i, s_i, R_ij),
where R_ij ∈ O_i.
Question: Is there a sequence i_1, ..., i_t of indices such that
(0, POLICY, R) ∈ (f_{i_t}, s_{i_t}) ∘ ··· ∘ (f_{i_1}, s_{i_1})({(Λ, Λ, R)})?
Each version of POC can be defined using "agglomeration" (f_2, s_2) * (f_1, s_1)
instead of composition (f_2, s_2) ∘ (f_1, s_1). The result of applying the sequence
of assertions (f_{i_1}, s_{i_1}), ..., (f_{i_t}, s_{i_t}) agglomeratively to an acceptance set S_0 is
defined inductively as follows: S_1 = (f_{i_1}, s_{i_1})(S_0) ∪ S_0 and, for 2 ≤ j ≤ t,
S_j = (f_{i_j}, s_{i_j})(S_{j-1}) ∪ S_{j-1}. Thus, for any acceptance set A, A ⊆ (f_{i_t}, s_{i_t}) *
··· * (f_{i_1}, s_{i_1})(A). The agglomerative versions of the decision problems are identical
to the versions already given, except that the acceptance condition is
"(0, POLICY, R) ∈ (f_{i_t}, s_{i_t}) * ··· * (f_{i_1}, s_{i_1})({(Λ, Λ, R)})?" We refer to "agglomerative
POC," "agglomerative MPOC," etc., when we mean the version defined
in terms of * instead of ∘.
compliance checker. The CCA1 algorithm checks for violations of the promise
every time it simulates an assertion. The pseudocode for these checks is omitted
from the statement of CCA1 given here, because it would not illustrate the basic
structure of the algorithm; the predicate IllFormed() is included in the main
loop to indicate that the checks are done for each simulation.
Fig. 2. Pseudocode for Algorithm CCA1
Note that CCA1 does mn iterations of the sequence (f_{n-1}, s_{n-1}), ..., (f_1, s_1),
(f_0, POLICY), for a total of mn^2 assertion-simulations. Recall that a set F =
{(f_{j_1}, s_{j_1}), ..., (f_{j_t}, s_{j_t})} ⊆ {(f_0, POLICY), ..., (f_{n-1}, s_{n-1})} "contains a proof
that r complies with POLICY" if there is some sequence k_1, ..., k_u of the indices
j_1, ..., j_t, not necessarily distinct and not necessarily exhaustive of j_1, ..., j_t,
such that (0, POLICY, R) ∈ (f_{k_u}, s_{k_u}) * ··· * (f_{k_1}, s_{k_1})({(Λ, Λ, R)}).
The following formal claim about this algorithm is proven in [7].
Theorem 1. Let (r, {(f_0, POLICY), (f_1, s_1), ..., (f_{n-1}, s_{n-1})}, c, m, s) be an
(agglomerative) LBMAPOC instance.
(1) Suppose that F ⊆ {(f_0, POLICY), (f_1, s_1), ..., (f_{n-1}, s_{n-1})} contains a
proof that r complies with POLICY and that every (f_i, s_i) ∈ F satisfies the
promise of LBMAPOC. Then CCA1 accepts (r, {(f_0, POLICY), (f_1, s_1), ...,
(f_{n-1}, s_{n-1})}, c, m, s).
(2) If {(f_0, POLICY), (f_1, s_1), ..., (f_{n-1}, s_{n-1})} does not contain a proof that
r complies with POLICY, then CCA1 rejects (r, {(f_0, POLICY), (f_1, s_1), ...,
(f_{n-1}, s_{n-1})}, c, m, s).
Note that cases (1) and (2) do not cover all possible inputs to CCA1. There
may be a subset F of the input assertions that does contain a proof that r
complies with POLICY but that contains one or more ill-formed assertions. If
CCA1 does not detect that any of these assertions is ill-formed, because their
ill-formedness is only exhibited on acceptance sets that do not occur in this computation,
then CCA1 will accept the input. If it does detect ill-formedness, then,
as specified here, CCA1 may or may not accept the input, perhaps depending
on whether the record (0, POLICY, R) has already been produced at the time of
detection. CCA1 could be modified so that it restarts every time ill-formedness
is detected, after discarding the ill-formed assertion so that it is not used in the
new computation. It is not clear whether this modification would be worth the
performance penalty. The point is simply that CCA1 offers no guarantees about
what it does when it is fed a policy that trusts, directly or indirectly, a source
of ill-formed assertions, except that it will terminate in time O(mn^2 (nms)^c). It
is the responsibility of the policy author to know which sources to trust and to
modify the policy if some trusted sources are discovered to be issuing ill-formed
assertions.
Finally, note that O(mn^2 (nms)^c) is a pessimistic upper bound on the running
time of the compliance checker. It is straightforward to check (each time an
assertion (f_i, s_i) is run, or at some other regular interval) whether the acceptance
record (0, POLICY, R) has been produced and to "stop early" if it has. Thus,
for many requests R that do comply with policy, the algorithm CCA1 will find
compliance proofs in time less than O(mn^2 (nms)^c).
Distributed Access-Rights Management with
Delegation Certificates
Tuomas Aura
1 Introduction
New distributed discretionary access control mechanisms such as SPKI [14] and
PolicyMaker [8,9] aim for decentralization of authority and management operations.
They do not rely on a trusted computing base (TCB) like traditional
distributed access control [22]. Instead, the participants are assumed to be untrusted
the way computers on open networks (e.g. the Internet) are in reality.
The decentralization, however, does not mean sliding back to anarchy such
as the PGP web of trust [26]. The new systems offer ways of building local relations
and setting up local authorities that arise from the personal and business
connections of the participants. The access control mechanisms do not mandate
any hierarchical or fixed domain structure like, for example, Kerberos [18] and
DSSA [16]. All entities are equally entitled to distribute rights to the services in
their control and to act as an authority for those who depend on them for the
services.
The main mechanism used in the new access control systems is delegation
of access rights with signed certificates. The signing is done with public-key
cryptography. With a certificate, one cryptographic key delegates some of its
authority to another key. The certificates can form a complicated network that
reflects the underlying relations between the owners of the private signature
keys.
By taking the cryptographic keys as their principal entities, the systems avoid
dependence on trusted name and key services such as the X.500 directory [12].
If any names are used, they are not global distinguished names but relative to
the users [1,24].
This paper explains the principles behind the delegation certificates in an
abstract setting without exposing the reader to implementation details. The discussion
is based primarily on the SPKI draft standard although we will not touch
Fig. 1. With a delegation certificate, the issuer shares authority with the subject.
S_K(During the validity period T1-T2, if I have any of the rights R,
I give them also to K'.)
(S_K(...) denotes a signed message that includes both the signature and the
original message.) The key that signed the certificate (K) is the issuer and
the key to whom the rights are given (K') is the subject of the certificate, and
the rights R given by the certificate are the authorization (following the SPKI
terminology). With the certificate, the issuer delegates the rights to the subject.
All delegation certificates have a validity period (T1-T2) specified on them.
When the certificate expires, the subject loses the rights that the certificate may
have given to it. Together with the authorization field, this parameter is used
for regulating the amount of trust the issuer places on the subject. Extremely
short validity periods are used to force on-line connections to the issuer. (For
simplicity, we often omit the validity period in the text below.)
The authorization is usually the right to use certain services. Sometimes, it
can be an attribute that the subject uses as a credential to acquire access rights.
Such attributes can be interpreted as abstract rights that may not directly entitle
the subject to any services but help in acquiring such rights. The syntax of the
authorization is application dependent and each application must provide its
own rules for comparing and combining authorizations. (See [14] Sec. Examples
for some typical authorizations.)
Some characteristics of delegation certificates are that any key can issue
certificates, a key may delegate rights that it does not yet have but hopes to
obtain, and the issuer itself does not lose the rights it gives to the subject. In
the following, we will discuss these and other properties of delegation in detail.
Our view of the world is key-oriented. The entities possessing, delegating
and receiving access rights are cryptographic key pairs. The public key is used
to identify the key pair and to verify signatures. The private key can sign mes-
sages. It is held secret by some physical entity that uses the key and the rights
attributed to the key at its will. All keys are generated locally by their owners.
There is no limit on the number of keys one physical entity may own. On the
other hand, if a physical entity is to receive any rights from others, it must be
represented by at least one key. Most public-key infrastructures (e.g. X.509) are
identity-oriented. In them, access rights are given to names of entities and the
names are separately bound to keys.
Delegation certificates also differ from traditional access control schemes in
that any key may issue certificates. There is no central or trusted authority that
could control the flow of access rights. All keys are free to delegate access to
services in their control.
The delegation takes effect only to the extent that the issuer of a certificate
itself has the authority it is trying to delegate. Nevertheless, it is perfectly legal
to issue delegation certificates for rights that one does not yet have or for broader
rights than one has in the hope that the issuer may later obtain these rights.
Sometimes a key may delegate all of its rights to another key. The policy for
calculating the access rights received by the subject when the issuer itself
does not possess all of the rights listed in the certificate depends on the type of
authorizations in question. In this paper, we consider only set-type authorizations
as does most of the literature [4,14]. The subject gets the intersection of the rights held by
the issuer and the rights mentioned in the certificate. Since this limitation is
Since the signing of a certificate happens locally at the physical entity pos-
sessing the private issuer key, the act of signing does not invalidate any existing
certificates or affect existing rights. The subject of the certificate gets new rights
but the issuer does not lose any.
In this way, delegation is less powerful than transfer of rights where the
originating entity loses what it gives. On the other hand, delegation is far easier
to implement. The simplicity of the implementation (signing a certificate) in a
distributed environment is the reason why delegation is preferable to transfer as
an atomic access control primitive. (See Sec. 4.1 about implementing transfer.)
This section shows how the certificates are used as a proof of authority and how
the rights can be passed forward through several keys and certificates.
We begin by considering the simple case of access right verification where the
owner of a service has issued a certificate directly to the user of the service (like
K delegates to the public key K' above). When the user wants to use its rights,
it signs a request with its private key (K') or in some other way authenticates
(e.g. by establishing an authenticated session) the access request with the private
key. This can be construed as redelegating the rights to the request. The user
attaches the delegation certificate to the access request and sends both to the
server.
Every service platform has either a single master public key that controls all
access to it or access control lists (ACL) that determine the privileged public
keys for each service. When the server receives an access request with an at-
tached delegation certificate, it first verifies that the certificate is signed by a
key controlling the requested service (K). It then checks that the authorization
in the certificate is for the service and that the key requesting the service is
the same as the subject of the certificate (K'). In this scenario, the certificate
behaves like a capability. The signature protects the capability from falsification
and binds it to the subject key.
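As an illustration of this capability-style check, the server's test can be sketched in Python. The certificate layout and the toy "signature" are assumptions for the sketch; the paper does not prescribe a concrete encoding, and a real system would use public-key signatures.

```python
# Sketch of a server checking a single delegation certificate used as a
# capability.  Certificate layout and toy_sign() are illustrative
# assumptions, not the paper's concrete encoding.
from dataclasses import dataclass

@dataclass(frozen=True)
class Certificate:
    issuer: str          # public key of the signer (K)
    subject: str         # public key receiving the rights (K')
    rights: frozenset    # set-type authorization
    signature: str       # stand-in for a real digital signature

def toy_sign(issuer: str, subject: str, rights: frozenset) -> str:
    # Toy "signature" for illustration only: a real system would sign
    # the certificate body with the issuer's private key.
    return f"sig({issuer}:{subject}:{sorted(rights)})"

def verify_request(cert: Certificate, master_key: str,
                   requester: str, operation: str) -> bool:
    return (cert.signature == toy_sign(cert.issuer, cert.subject, cert.rights)
            and cert.issuer == master_key      # signed by the controlling key K
            and cert.subject == requester      # bound to the requesting key K'
            and operation in cert.rights)      # authorization covers the request
```

The signature check protects against falsification, and the subject check binds the capability to the requesting key, as described above.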
Just as a key may delegate rights to services it directly controls, it may also
redelegate rights it received by delegation from other keys. In this paper we
assume that redelegation is always allowed unless a certificate explicitly forbids
it.
When a key delegates to another key and this key in turn redelegates to a
third one, and so on, the delegation certificates form a chain. In the chain, the
access rights flow from issuers through certificates to subjects. The original issuer
is usually the service producer and the final subject is a client of the service.
If all the certificates in the chain delegate the same access rights and specify
the same validity period, these rights are passed all the way from the first issuer
to the subject of the last certificate. But it is not necessary for the certificates
to have the same authorization field and validity period. Recall that the
rights obtained by the subject key are the intersection of the rights possessed
by the issuer and the authorization field of the certificate. Consequently, the
rights passed through the chain of certificates can be computed by taking the
intersection of the rights possessed by the first issuer with the authorizations
on all the certificates in the chain. (Sometimes the intersection can be empty
meaning that no rights are passed all the way.) Likewise, the validity period
of the chain is the intersection of the periods specified on the individual
certificates. For example, in the chain of Fig. 2, Key4 receives the right to the
web pages "http://S/file" for today only from key Key1.
[Fig. 2. A chain of delegation certificates from Key1 to Key4. The client, holder
of Key4, sends to server S (controlled by Key1) the access request "Read
http://S/file. Signed: Key4" together with the certificate chain.]
Since the certificates can be issued by anyone to anyone, they do not neces-
sarily form simple chains. Instead, the certificates form a graph structure called
a delegation network. In the delegation network, there may be many chains of cer-
tificates between the same pair of keys. Naturally, the rights passed between two
keys are the union of the rights passed by all individual chains between them.
(Set-type authorizations are combined with the union operator. Other policies
are possible for other types of authorizations.)
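The two rules above can be sketched in Python: the rights passed along a chain are the intersection of its authorizations (and validity periods), and the rights passed through a network are the union over its chains. The tuple layout per certificate is an assumption made for the sketch.

```python
# Sketch: effective rights through a chain = intersection of the set-type
# authorizations (and validity periods); through a delegation network =
# union over its chains.  The (rights, (start, end)) layout is an
# illustrative assumption.

def chain_rights(issuer_rights, chain):
    """chain: list of (rights, (start, end)) tuples, one per certificate."""
    rights = set(issuer_rights)
    start, end = float("-inf"), float("inf")
    for cert_rights, (s, e) in chain:
        rights &= set(cert_rights)               # intersect authorizations
        start, end = max(start, s), min(end, e)  # intersect validity periods
    if start > end:                              # empty validity: nothing passes
        return set(), None
    return rights, (start, end)

def network_rights(issuer_rights, chains):
    total = set()
    for chain in chains:
        rights, validity = chain_rights(issuer_rights, chain)
        if validity is not None:
            total |= rights                      # union over parallel chains
    return total
```

Note that, exactly as the text observes, the intersection along a chain may be empty even though every individual certificate delegates some rights.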
When a key requests a service and it has obtained the access rights through
a chain of certificates, it attaches the entire chain to its request (Fig. 2). The
server will verify that the chain originates from a key controlling access to the
requested service, that it ends at the key making the request, that each certificate
in the chain is signed by the subject of the previous certificate, and that each
certificate in the chain authorizes the request. Several chains of certificates can
be attached if the combined rights delegated by them are needed for the access.
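The server-side checks just listed can be sketched as follows. Certificates are reduced to (issuer, subject, rights) triples and signature verification is a stand-in predicate; both are assumptions of the sketch, not the paper's encoding.

```python
# Sketch of the server's verification of an attached certificate chain:
# it must originate at a key controlling the service, each certificate
# must be signed by the subject of the previous one, the chain must end
# at the requesting key, and every certificate must authorize the request.

def signature_valid(cert):
    # Placeholder for real public-key signature verification.
    return True

def verify_chain(chain, master_key, requester, operation):
    if not chain or chain[0][0] != master_key:
        return False                       # must originate at the controlling key
    prev_subject = master_key
    for issuer, subject, rights in chain:
        if issuer != prev_subject:         # issuer must be previous subject
            return False
        if not signature_valid((issuer, subject, rights)):
            return False
        if operation not in rights:        # every certificate must cover it
            return False
        prev_subject = subject
    return prev_subject == requester       # chain must end at the requesting key
```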
The signing of an access request can also be thought of as redelegation to
the request. Furthermore, the request may be program code. In that case, it is
natural to think that the last key in a chain redelegates to the code and the code
makes the actual requests. Consequently, delegation certificates should be used
to express this delegation. This is done by having the last key sign a delegation
certificate where the subject is not a key but a hash value of the program code.
We will see an example of this in Sec. 2.4.
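Delegating to code can be sketched as hashing the program and using the digest as the certificate subject; the platform then hashes the code it actually received. The choice of SHA-256 here is an illustrative assumption.

```python
# Sketch: when the subject of a certificate is program code, the subject
# field holds a hash of the code, and the service platform checks that
# the code it received hashes to that value.  SHA-256 is chosen only for
# illustration.
import hashlib

def code_subject(code: bytes) -> str:
    return hashlib.sha256(code).hexdigest()

def platform_accepts(cert_subject: str, received_code: bytes) -> bool:
    return cert_subject == code_subject(received_code)
```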
In a way, the delegation certificates behave like signed requests for capability
propagation (see e.g. [17]). The requests are honored only if the signer itself has
the capability. However, it is not necessary for the server or for a trusted party
to process each propagation before the next one is made, and no new capabilities
need to be produced before the rights are used. Instead, the information is stored
in the form of the delegation certificates.
Although the most obvious way of managing certificates is to accumulate
them along the chain of delegation and to attach them to the service requests,
there is no compelling reason to do this. The certificates can be stored and
managed anywhere as long as the verifier gets them in the end. The accumulation
is not always even possible if the certificates are not issued and renewed in the
order of their positions in a particular chain.
Managing long chains of certificates can become a burden. A technique called
certificate reduction saves work in verifying the certificate chains. We observe
that two certificates forming a chain,

   S_K1(K2 has the rights R1)  and  S_K2(K3 has the rights R2),

imply a direct delegation from K1 to K3:

   S_K1(K3 has the rights R1 ∩ R2),

where S_K(m) denotes the message m signed with the private key corresponding
to K.
This section explains the rationale behind delegation with certificates. We em-
phasize suitability for open distributed systems with no globally accepted au-
thority. The advantages center around the high level of distribution achieved
both in authority and management work load.
The first key to the distribution is that the entities are represented by their
cryptographic keys instead of names. Names are a natural way of specifying
entities for humans but they are less suitable for cryptographically secure au-
thentication. Ellison [13] discusses the complicated connection between keys and
identities. In a key-oriented world, we don't need trusted third parties to certify
the binding between a name and a public key. Instead, the public key is used
directly to specify an entity. Thus, the centralized or hierarchical certification
authorities (CA) that are the heart of traditional identity-based access control
lose their role in key-oriented systems. This is a major security advantage. For
example in X.509, the name certificates come from a global hierarchy of offi-
cials (CAs) who must all be trusted with respect to any access control decision
whose security depends on the correct mapping between a key and a name.
Key-oriented access control avoids such obvious single points of failure. Fig. 3
illustrates the difference. It becomes even more obvious when we remember that,
in a truly name-oriented system, the mapping from name to key must be done
also for the service owner and for every key in a chain of delegation.
[Fig. 3. Identity-oriented vs. key-oriented access control: in the identity-oriented
model, a certification authority certifies the binding between the owner of
service S and a key; in the key-oriented model, the owner's key controls access
to service S directly.]
The second key to the distribution is that we do not con-
ceptually differentiate between keys that are allowed to grant access and ones
that only use services. From a technical point of view, all keys are equal regard-
less of the importance of rights they handle. Any key may issue certificates to
others and distribute access rights to the services in its control without asking
permission from any other entity.
The bottom-up formation of policy is the most distinguishing property of
certificate-based access control. The certificates are issued locally by the entities
that produce the services and, therefore, should be responsible for granting ac-
cess to them. The certificates are formal documents of local trust relationships
that arise from voluntary personal, technical and business relations between the
entities possessing private keys. Nothing is mandated by global authorities. In a
chain of delegation certificates, every link is a result of a local policy decision.
Because of redelegation, the local decisions by individual key owners have global
consequences.
The lack of enforced hierarchical structure makes the system open and scal-
able. Setting up a new entity is as simple as generating a signature key. New
keys may be created locally as new physical entities or services are introduced.
In comparison, most traditional access control systems achieve scalability with
a hierarchy of trusted entities or domains and require meta-level maintenance
operations for changes in the hierarchy [12,16,18].
An important observation in the certificate management is that the storage
and distribution of certificates is a separate concern from their meaning [8,9].
The integrity of the certificate data is protected by the signatures. Thus, un-
trusted entities can be allowed to handle them. Often, organizations will want
to set up certificate databases for access-right acquisition and management. Sim-
ilarly, most application software packages are unlikely to include their own cer-
tificate management. Most of the tasks can be done for them by untrusted servers
or helper software. Nikander and Viljanen [23] describe a way of storing certifi-
cates in the Internet domain name service and a discovery algorithm based on
[3].
The certificates should be considered as sets or graphs rather than as chains.
This is because the order of issuance of the certificates does not necessarily have
any correlation to their order in a particular chain. The time of issuance and the
validity periods depend on the local trust relations behind the certificates. These
are independent of whatever chains may be formed globally. If one certificate in
a chain expires, only that one needs to be refreshed immediately. The other
certificates in the chain remain valid. In this way, the system is distributed not
only in space but also in time. It is even possible that the private keys or the
owners of the keys who issued some of the certificates have ceased to exist by
the time the certificates are used as a part of a chain. This is commonly so when
temporary key pairs are used for anonymity or when an old system delegates its
tasks and rights to a replacing one.
Certificate reduction allows a trade-off between communication and certifi-
cate processing cost. Reduction requires one to contact the issuer of the reduced
certificate. This means that the entity must be on-line at the time of the
reduction and that this on-line system must be secure enough to hold the private
signature key. Luckily, the reduction engine can be fairly simple to build. If all
the certificates in the chain delegate the same rights or if the application has
straightforward rules for computing intersections of the authorizations, the re-
duction is a purely syntactical transformation and it can be fully automated. The
issuer of the reduced certificate does not need to fully understand the meaning
of the certificates. The reduction may save significantly in the costs of certificate
transfer and verification.
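The purely syntactical reduction described above can be sketched as follows: two adjacent certificates S_K1(K2 has R1) and S_K2(K3 has R2) are replaced by a single certificate S_K1(K3 has R1 ∩ R2), which K1 must sign anew. The dict layout is an assumption; real signing is elided.

```python
# Sketch of certificate reduction.  K1, the issuer of the first
# certificate, signs a new certificate directly to K3 with the
# intersected authorization and validity period.

def reduce_chain(cert1, cert2):
    assert cert1["subject"] == cert2["issuer"], "certificates must form a chain"
    return {
        "issuer": cert1["issuer"],                     # K1 signs the reduction
        "subject": cert2["subject"],                   # delegates directly to K3
        "rights": cert1["rights"] & cert2["rights"],   # R1 ∩ R2
        "valid": (max(cert1["valid"][0], cert2["valid"][0]),
                  min(cert1["valid"][1], cert2["valid"][1])),
    }
```

Applied repeatedly, this collapses a long chain into a single certificate that the verifier can check with one signature verification.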
Unlike mandatory access control mechanisms, delegation does not have a
central reference monitor to supervise access and distribution of the access rights.
Neither is there a trusted computing base (TCB) to monitor the actions of the
distributed entities. This is because there is no global policy to enforce on the
parts of the system. In open environments like the Internet, it would not be
possible to implement any global controls. Instead, the policy is determined
locally by the parts.
Delegation leaves more to the discretion of the participants than many other
DAC mechanisms. The system does not enforce any policies to protect users from
bad decisions. All the verifier of a certificate sees is public keys and signatures.
It has no way of telling if the private keys belong to legitimate entities in the
system. The certificate issuer at the time of signing should, of course, have a
solid reason for trusting the subject with the delegated rights. But the system
leaves it to each key owner to judge by itself who can be trusted. The issuer
may or may not know the name or identity of the subject key owner. Moreover,
redelegation creates a new degree of freedom. When redelegation is allowed,
anyone can share his rights with others. This is equivalent to universal grant rights
from anyone to anyone.
It may seem that redelegation should be controlled. It is, indeed, possible to
add conditions on the certificates limiting redelegation. In SPKI, the choices are
to allow or forbid redelegation completely. A certificate that forbids redelegation
can only be the last certificate in a certificate chain.
There are, however, appealing arguments for allowing free redelegation. It
may be convenient for the client to authorize someone else to use the rights on
its behalf, or the client may want to redelegate the rights to one of its subsidiaries.
The internal organization of the clients of a service should not be a concern to
the service providers. Free redelegation makes the internals of the system parts
more independent thus furthering distribution.
The key-oriented nature of the system also obscures the semantic meaning
of the restrictions on redelegation. Only in special circumstances does the issuer
of a certificate know that the private part of the subject key is held permanently
secret by a certain physical entity. In many cases, the issuer accepts any key given
by the entity that is to receive the rights. In general, there is no guarantee that
the corresponding private key will not be revealed to others. Giving out the key
would spread the rights as effectively as redelegation but, unlike in redelegation,
the rights and their validity period could not be limited. Usually, there are also
other ways to redistribute the services without the agreement of the originator,
e.g. establishing proxy servers and outright duplication of the server data.
Instead of forbidding redelegation, the issuer of a certificate should consider
changing the authorization and validity period. Minimizing the scope of dele-
gated rights is the most natural way to express limited trust in the subject. The
certificates make it easy to reconsider the rights and validity in each step of a
delegation chain.
All in all, delegation handles well the part of access control that is easy to im-
plement: maximally discretionary distribution of access rights. Mechanisms for
identity certification, limits on redistribution, rights transfer and revocation can
be added where they are required. However, such features in principle require
a more complex infrastructure with a TCB, tamper-resistant modules, trusted
third parties or on-line communication. Therefore, the basic access control sys-
tem should not require their use. We believe that there are many instances where
pure delegation is a sufficient mechanism and corresponds well to the real-life
access-control needs.
Delegation certificates are most suitable for use in distributed systems with no
globally accepted security policy or authority and in open systems with no central
registration of the servers and clients. They are unlikely to find applications in
high-security environments where mandatory access control and trusted systems
are a rule. Although most applications will be on the Internet, we will look at
an example from the telecommunications world.
Calypso [19] is a distributed service architecture for intelligent networks (IN).
It is designed for ATM access networks where a workstation running the Calypso
service platform controls an ATM switch. Calypso provides flexible distribution
of service and network control functions among service clients, servers and net-
work nodes. The same architecture could also reside on top of other types of
network equipment such as IP switches and firewalls.
Calypso is based on a business model where the network operator who owns
the infrastructure offers network resources to service providers (SP). These re-
sources are the lowest level of Calypso services. Service providers can either
market the right to use their services to end users or they can sell them to other
SPs for reselling or for use as building blocks of more sophisticated services.
Complex services and their components form tree-like structures (Fig. 4).
All Calypso services are implemented as Java packages. A service may use
other services by calling methods of the classes that belong to them. Hence, the
network nodes must have an access control system that facilitates execution of
code from mutually distrusting SPs and contracting of services between them.
The Calypso security requirements and a tentative architecture for satisfying
them were outlined in [6].
The IN access control mechanisms should encourage free formation of busi-
ness relations between service producers and merchants distributing access rights.
[Fig. 4. (a) A tree of services: providers SP1-SP5 compose services
Service1-Service6 from each other's services on top of the low-level network
services; (b) the corresponding delegation certificates, originating from the
network operator and following the service relationships.]
[Fig. 5. Contracting a service: service provider 2 (holding Key2, owner of
Service2) signs the delegation certificate "Key1 may access Service2" for
service provider 1 (holding Key1), who signs "hash(Service1 code) may access
Service2" so that Service1 can use Service2 on the service platform.]
[Fig. 6. The same contract mediated by a service broker holding KeyB: Key2
signs "KeyB may access Service2" and KeyB signs "Key1 may access Service2".]
Such a quality-review process is not easy to organize because the aim is to
allow fast development of
services by a large number of independent SPs. A possible solution is to use
independent quality-control (QC) units that certify services if they meet some
minimum quality criteria. In Fig. 7, the network operator grants access rights to
SP1's code only after receiving the review results. If the QC writes a certificate
to Key1 instead, that means it has reviewed the production process of SP1 and
the network operator can trust SP1 to do its own quality control.
[Figure: the network operator (holding KeyN) signs "hash(Service1 code) may
access Network services" only after seeing the quality certificate
"hash(Service1 code) passed quality check. Signed: KeyQ" from the
quality-control unit (holding KeyQ).]
Fig. 7. Code quality control for IN services with basic delegation certificates
The following sections introduce certificates with more complex structure than
the ones we have seen so far. The enhancements increase significantly the fiexi-
bility of certificates as an access control tool. Threshold certificates are a means
of dividing authority. Instead of giving the rights to a single subject, they are
given to a group of subjects who must co-operate to use the rights. Sec. 3.1
describes the certificates and Sec. 3.2 explains how they are used. Conditional
delegation is a way of expressing simple access control policy rules in certificates.
We will introduce the new type of certificates in Sec. 3.3 and look at applications
in Sec. 3.4.
The usual way for the subjects to co-operate is to redelegate the rights to
one of them or to some other single entity. In Fig. 8, KeyC receives the right
R from KeyS because two subjects of the (2,3)-threshold certificate co-operate
to pass it to KeyC. When KeyC wants to use the right R, it must attach all
three certificates to its access request. It should be noted that the subjects of
the threshold certificate do not need to delegate directly to the same key as in
the figure. The delegation could go through independent or partially dependent
chains of certificates and even through other threshold certificates before the
shares are accumulated to a single key. The general structure of such networks
was studied in [4].
The threshold certificates can also be used in some situations where there
is no real threshold trust scheme. An example below will show how they may
improve distribution and flexibility of the system.
(k,k)-threshold certificates where all subjects are required to co-operate are
sometimes called joint-delegation certificates. Open threshold certificates [4] are
a variation of the threshold certificates where each subject is given a separate
certificate and new subjects can be added later.
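A verifier's check of the simplest threshold pattern, where the shares are redelegated directly to a single key as in Fig. 8, might be sketched as follows. The data layout is an illustrative assumption, and chains through intermediate keys (studied in [4]) are not handled.

```python
# Sketch: a (k,n)-threshold delegation is satisfied when at least k
# distinct subjects of the threshold certificate have redelegated the
# right to the requesting key.

def threshold_satisfied(threshold_subjects, k, delegations, requester, right):
    cooperating = {
        issuer
        for issuer, subject, rights in delegations
        if issuer in threshold_subjects   # a share holder of the threshold cert
        and subject == requester          # delegates to the requesting key
        and right in rights
    }
    return len(cooperating) >= k          # k distinct shares suffice
```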
[Fig. 8. A (2,3)-threshold certificate "Any 2 of {Key1, Key2, Key3} have all my
rights. Signed: KeyS" shares the authority of the master key KeyS among three
new keys. Key2 and Key3 each sign the delegation certificate "KeyC has right R",
so the two shares suffice to pass R to KeyC.]
This certificate allows any two of the new keys to operate on behalf of the
original master key. The new private keys are stored in separate places while the
old private master key is destroyed or stored in a safe place and never accessed.
This protects both against theft and accidental loss of the private master key.
If one share is compromised, it alone is not enough to misuse the service, and if
one share of the three is lost, the other two can still grant access to the service.
The SP can still advertise its old public key (K or KeyS) and receive new rights
delegated to that key. The threshold certificate passes all these rights to the
share keys who can authorize others. This kind of protection of private keys
may prove to be a much more common reason to use threshold certificates than
actual threshold trust schemes between business associates.
There is another, rather unexpected, application for the threshold certificates.
We will see that flexibility can be added to systems like that of Fig. 7 by encoding
an implication rule in a certificate. In Fig. 9, the quality-control key (KeyQ)
certifies the code by granting the code all its rights. The network operator issues
a (2,2)-threshold certificate for SP1 and QC. Thus, SP1 needs the agreement of
the QC before it can use the network services. The three certificates together
convey the right to access Network services from KeyN to Service1. With these
certificates, Service1 can prove its access rights when installed into a network
node.
The QC key is never used for any other purpose than for certifying entities
that have passed the quality check. A quality certificate can be issued to code
or to service providers whose own quality control has passed an audit. In the
quality certificates, the QC delegates all its rights so that it does not need to
know what kind of access the quality certificates are used for. This means that the
QC key should never be given any other rights than one share of a threshold
certificate for the purposes of a quality check on whoever gets the other share.
[Fig. 9. Quality control with a threshold certificate: the network operator
(KeyN) signs the (2,2)-threshold certificate "Key1 and KeyQ together may access
Network services"; service provider 1 (Key1) signs "hash(Service1 code) may
access Network services"; quality control (KeyQ) signs "hash(Service1 code)
has all my rights".]
Compared to the use of basic delegation certificates (Fig. 7), the threshold
certificate has the advantage that dependences between the quality control and
the granting of access to the services have been reduced to a minimum. It is not
necessary to involve the network operator every time code is changed and the
certificates can be signed in any order. The three certificates can be issued and
renewed independently of each other.
Admittedly, this is a somewhat inelegant way to use threshold certificates.
The rights delegated by the certificate are not encoded in the authorization
part ("all my rights") but in the signing key ("KeyQ is the quality-control key").
If there are several different authorizations (e.g. several types of quality check),
the QC unit must have an equal number of signature keys. Moreover, comparing
different levels of authorization becomes difficult when the authorizations are
encoded in the keys. We observe that the threshold certificate in Fig. 9, in effect,
carries the meaning
"Keyl may access the network services if it also has
a quality certificate from KeyQ."
In Sec. 3.3, we will introduce a new type of certificate that explicitly
includes such conditions. The reason why we have described at length how to
encode the same meaning into a threshold certificate is that only threshold
delegation is currently supported by SPKI.
Conditional certificates are like the basic delegation certificates except that they
state additional conditions that have to be satisfied before the certificate is con-
sidered valid. A conditional certificate is a signed message with the same
contents as a basic delegation certificate (issuer key, subject, authorization,
validity period) plus a list of conditions.
The certificate gives the rights to the subject only if all the conditions in the
list are satisfied. The conditions always take the same form: they require the
subject key of the certificate to have a certain authorization from a certain key.
This is natural because any attribute that the subject may have can be verified
only if it is expressed as an attribute certificate from a proper authority.
In order to use the certificate, the subject must provide a proof that the
conditions are fulfilled. It does this by attaching appropriate certificate chains,
one for each condition. The certificates in the proof of access rights form a tree
(or a directed acyclic graph) rather than a chain.
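The validation of a conditional certificate can be sketched as below: every condition names an authorization the subject must hold from a particular key, and the subject attaches a proof per condition. The data structures are assumptions of the sketch, and full chain verification is reduced to a stand-in predicate.

```python
# Sketch of checking the conditions of a conditional certificate.  Each
# condition (required_issuer, required_auth) is satisfied when some
# attached proof shows the certificate's subject holding that
# authorization from that key.
from dataclasses import dataclass, field

@dataclass
class ConditionalCert:
    issuer: str
    subject: str
    rights: set
    conditions: list = field(default_factory=list)  # (required_issuer, required_auth)

def chain_grants(proof, issuer, subject, auth):
    # Placeholder: a real verifier would validate a whole certificate
    # chain here; a proof is modeled as a recorded grant triple.
    return proof == (issuer, subject, auth)

def conditions_met(cert, proofs):
    return all(
        any(chain_grants(p, req_issuer, cert.subject, req_auth) for p in proofs)
        for req_issuer, req_auth in cert.conditions
    )
```

With each condition backed by its own chain, the full proof of access rights indeed forms a tree (or DAG) of certificates rather than a single chain.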
When certain attributes are required before granting access to a service, a
conditional certificate offers two advantages compared to the basic delegation
certificates. First, the certificate is an unambiguous, standard-form statement of
what kind of attribute certificates are still needed for the access. Secondly, the
conditional certificate can be signed before obtaining the attribute certificates.
Without conditional certificates, the client in need of access rights would first
contact the issuer to find out what the prerequisites are, then try to acquire them
and, in the end, return to the issuer with the collection of credentials in order
to get the new certificate. When the decision rule is encoded in a certificate,
the issuer needs to be contacted only once. The client may obtain the attributes
before or after this contact. Thus, communication and synchronization between
the entities is greatly reduced.
The conditional certificates express simple policy rules but they are, by no
means, a general language for defining policies. For example, the certificates pre-
sented here cannot express symbolic rules. A more general language for express-
ing conditions, policies and limits on redelegation is a topic of active research.
Conditional certificates are just the right tool for the kind of situations where we
slightly abused threshold certificates in Sec. 3.2. The code quality check for an
IN service can be expressed as a condition. Fig. 10 reformulates the certificates
of Fig. 9 with conditions. The result is functionally the same but the system is
much more intuitive for a designer or an observer.
With the threshold certificates, the quality certificate told only indirectly
what kind of authority it gives to the subject. The authorization was encoded
in the signature key. In addition to being cumbersome to understand, this en-
coding has the disadvantage that the authorizations are not comparable. The
conditional certificates, on the other hand, explicitly state the rights or attributes
in the authorization field of the certificate and the authorizations can be com-
pared. For example, we can let the IN network operator write the required level
of quality in the conditional certificate. If the QC issues a quality certificate with
the same or higher level to Service1, the code will get the access rights.
[Figure: the network operator (KeyN) signs the conditional certificate "Key1 may
access Network service if it has 'passed quality check' from KeyQ"; service
provider 1 (Key1) signs "hash(Service1 code) may access Network service";
quality control (KeyQ) signs "hash(Service1 code) passed quality check".]
Fig. 10. Code quality control made simple with conditional certificates
4 Limitations
The flexibility of the certificates does not come completely without a price. There
are many security goals that require centralized control and cannot be realized
only with signed messages. We will consider such goals and see what kind of
central or trusted services they imply. Sec. 4.1 discusses policies that require
additional infrastructure. Sec. 4.2 brings up the issue of quantitative rights. In
Sec. 4.3 we consider revocation and in Sec. 4.4 anonymity and auditing.
The certificates are a form of discretionary access control. They cannot express
mandatory policies like the Bell-LaPadula model [7] because we do not assume
any mechanism for enforcing a policy globally. A mandatory policy would require
all equipment on the system to be under the control of some authority so that
they can be trusted to follow the policy.
Another limitation is that the certificates can only convey policies where
the rights of the entities grow monotonically as they acquire new certificates. It
is impossible to verify that someone does not have a certificate. Consequently,
separation-of-duty policies like the Chinese Wall policy [11] cannot be expressed
with only certificates. They need some mechanism for keeping track of the previ-
ously granted rights. Moreover, if several distributed issuers give out certificates
for different conflicting rights, these issuers must share a single view of the sub-
jects' histories. The histories must be updated in real time when new certificates
are issued. An equally difficult problem is that in a key-oriented system, one
physical entity might use several keys to gain conflicting rights. We must first
identify the entities whose duties we want to separate and then find a way of
mapping keys to unique identities. For example, a trusted official could certify
the keys to be unique personal keys of the participating persons. Altogether, sep-
aration of duty appears to be one of the greatest challenges for certificate-based
access control.
A related problem is the separation of access rights and grant rights. In a
key-oriented architecture, someone with only grant rights could easily subvert
the protection mechanism by issuing the rights to a key held by himself [15].
Therefore, every key in practice has the rights that it is allowed to delegate to
others. Like the separation of duty, separate policies for granting and using rights
cannot be securely implemented unless each entity has a unique identity and
keys owned by the entity are bound to the identity. Consequently, key-oriented
systems usually do not even try to implement pure grant rights.
An occasionally needed access control feature is a proxy that can issue cer-
tificates with longer life-times than its own existence. For example, a manager
on a vacation should be able to delegate authority to a stand-in only for the time
of the leave but the decisions made by the stand-in should stay valid longer than
that. Unfortunately, the proxy can create valid-looking certificates even after
losing its authority. He simply writes false dates on them so that they appear
to be signed at the time when he was authorized. The problem can be solved
with the help of a trusted time-stamping service if such a centralized authority
exists. Often it is easiest to let the certificates expire when the mandate of the
proxy does and have the master entity revalidate them.
The above limitations are due to fundamental properties of the access control
mechanism. There are, however, some other respects in which we have deliber-
ately settled for less than the maximal expressive power. For example,
symbolic expressions could be allowed in conditional certificates. The extension
would make it possible to express general rules while the certificates proposed in
this paper can only speak of fixed keys. The reason for presenting the less gen-
eral model here is that it solves the practical problems with the basic delegation
certificates that we have met in applications. Future work on symbolic condi-
tions will determine the extent to which they are worth the increased complexity.
The question of the optimal expressive power for the certificates involves issues
of computational and communication complexity and typical usage patterns in
applications.
A question that we will leave open is the exact structure of the authoriza-
tions. The types of access rights and the policies for combining them depend
on the application. In Policy Maker [8,9], the authorizations are expressed as
small programs of a safe programming language and certificates can communi-
cate with each other. This maximally generic approach leads to concerns about
the tractability of access control decisions [10]. But no matter how the authoriza-
tions are encoded, certificates have one general limitation in this respect: they
can effectively express qualitative authorizations but not quantitative. That is
the topic of the next section.
1. Make the charging recursive. Every reseller will be responsible for collecting
payments from the clients it delegated rights to or for dividing quotas be-
tween them. Apart from physical control of the clients, there are two ways in
which the reseller can divide the services and the costs between its clients.
(a) The reseller divides the service capacity at its disposal into smaller time
slots and more specific methods of access. Because the authorizations
are refined to be suitable only for very narrow purposes, the clients must
repeatedly request new certificates from the reseller who collects usage
data.
(b) The authorized clients may use the services at any time and the server
collects usage data. The usage statistics and the certificate chains that
were used as proof of access rights are propagated from the server down
the tree of resellers.
2. Require the client keys in the system to be certified by a trusted entity that
guarantees their payments or gives them a credit rating. The server verifies
the credit before allowing access and collects payments directly from the
clients. This may not stop the sharing of access rights but it means there is
someone to pay for the metered usage.
3. Require the participants to incorporate a tamper-resistant police module on
their systems. Only keys on the tamper-resistant modules are allowed to
participate in the distribution of the access rights. The module can do the
accounting or enforce whatever limitations are wanted.
All these techniques incur a cost in that they require additional infrastructure
and make parts of the system less independent. But these costs are inherent to
any accounting and charging mechanism. With delegation certificates, we can
decide separately for each application if such measures are needed and if their
cost is acceptable.
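Approach 1(b) above could be sketched as follows. The entity names and log format are invented, and a real system would verify each certificate chain before charging; the point is that every entity on a chain's path is charged for its whole delegation subtree and can in turn bill the client it delegated to.

```python
from collections import defaultdict

# Each log entry: the delegation chain that authorized the access,
# and the number of service units consumed.
usage_log = [
    (("operator", "resellerA", "client1"), 5),
    (("operator", "resellerA", "client2"), 3),
    (("operator", "resellerB", "client3"), 2),
]

def charges(log):
    """Return, for each entity, the units consumed in its delegation subtree."""
    totals = defaultdict(int)
    for chain, units in log:
        for entity in chain:
            totals[entity] += units
    return dict(totals)

print(charges(usage_log))
# e.g. resellerA owes the operator for 8 units and recovers 5 from client1
```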
In Calypso, we have chosen the approach 1(b) because the delegation chains
between SPs are relatively short (often only one step) and the charging for
component services is arranged in the same tree-like manner. Access rights and
payments flow in the opposite directions in the service composition tree (Fig. 4).
A problem related to accounting is access rights transfer where an entity giv-
ing rights to others loses them itself. From a chain of certificates, it is not possible
to see if the chain ends there or if the rights have been redelegated further. Thus,
an entity that has redelegated its rights can still use them. Implementing trans-
fer in a distributed system requires a TCB or tamper-resistant police modules.
(The tamper-resistant module can, in fact, be thought of as a TCB.) One such
system for the transfer of software licenses between tamper-resistant smart cards
is described in [5].
Sometimes an entity distributing access rights may want to reverse its decision
after the rights have already been granted. The change of mind may be due
to changed circumstances or to more accurate information about the subject. In
a certificate-based architecture, this means invalidating certificates after they
have been issued but before their expiration dates. In general, any decrease in
the trust placed in the subject may require the issuer to sign a new certificate
and to cancel the old one.
In systems where all access is controlled by one reference monitor, access-right
revocation is a simple matter. It suffices to update the access control lists or to
store information on the exceptions at the place where the rights are verified.
In a distributed system, ACLs are stored and decisions to grant access are
made in more than one place. It is necessary to propagate the information on re-
voked access rights to all these places. The communication causes unpredictable
delays and, consequently, real-time revocation cannot be achieved as in a cen-
tralized system. An efficient infrastructure for propagating the revocation infor-
mation is a central part of many access control systems (for example [17,21]).
Some options are to broadcast revocation events or to notify only the interested
parties.
The signature key does not directly reveal who is responsible for a signed
request which makes it difficult to trace the responsible parties. For auditing,
the keys must be bound to the persons or legal entities that are liable for their
actions. Such bindings can be created by identity-escrow agents that
guarantee to find a responsible person if the need should arise. The escrow agents
issue certificates to the keys whose identities they have escrowed. The services
that require auditing only accept clients with the escrow certificates.
The key-oriented system protects the users' privacy by not explicitly revealing
their names. However, the keys are easily recognizable identifiers that can be used
to combine data collected from different sources. Therefore, further measures
are needed for reliable privacy protection. The certificate reduction (see Sec.
2.2) helps in some cases. A chain of certificates may reveal the identities of the
intermediate entities but when the chain is reduced, that information is hidden.
SPKI puts great emphasis on privacy aspects and relies mostly on the reduction.
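The reduction step can be illustrated with a small sketch. The field names are invented, and a real implementation would verify the signatures and re-sign the reduced certificate; what matters is that the single resulting certificate names only the first issuer and the final subject, so the intermediate keys disappear.

```python
def reduce_chain(chain):
    """chain: list of (issuer, subject, rights) tuples, where each
    subject is the issuer of the next certificate. Returns a single
    equivalent certificate tuple."""
    rights = set(chain[0][2])
    for _, _, r in chain[1:]:
        rights &= set(r)   # a delegate cannot pass on more than it holds
    return (chain[0][0], chain[-1][1], rights)

chain = [("server", "broker", {"read", "write"}),
         ("broker", "alice", {"read"})]
print(reduce_chain(chain))   # ('server', 'alice', {'read'}) — broker hidden
```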
An alternative anonymity technique is to create temporary keys that do
not reveal their owner. When a subject entity wants its anonymity protected,
it provides the issuer of a new certificate with a freshly generated public key.
The temporary keys cannot be recognized and connected to the owner or to
each other. This is often preferable to certificate reduction because the entity
responsible for generating the temporary keys is the one whose anonymity is at
risk. Although the generation of the temporary keys is costly, it can be done
off-line in advance. With both techniques, however, the cost of privacy is an
increase in communication and synchronization between entities.
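A rough illustration of the temporary-key technique, using random tokens as stand-ins for freshly generated public keys (a real system would generate key pairs with a cryptographic library and present the public half to the certificate issuer):

```python
import secrets

class Entity:
    def __init__(self, name):
        self.name = name
        self._temp_keys = []          # generated off-line, in advance

    def fresh_key(self):
        key = secrets.token_hex(16)   # stand-in for a new public key
        self._temp_keys.append(key)   # owner keeps the mapping private
        return key

alice = Entity("alice")
k1, k2 = alice.fresh_key(), alice.fresh_key()
# The tokens are independent: they reveal nothing about the owner and
# cannot be linked to each other.
print(k1 != k2 and "alice" not in k1)
```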
5 Conclusion
This paper described delegation certificates and some of their applications in
distributed access control. The goal was an abstract understanding of the basic
ideas without implementation details. We found that the main advantages of
the certificates lie in decentralization. We also introduced conditional certificates
that help in further distribution of management operations in the system.
Delegation catches well the spirit of what is natural to access control of
distributed digital services. Some access control policies require additional in-
frastructure such as a TCB or trusted servers. We feel that such costs should
be avoided wherever possible. When their limitations are kept in mind, the del-
egation certificates can satisfy many every-day access control needs and can be
used as a uniform basis for distributed discretionary access control.
Acknowledgements
The work was funded by Helsinki Graduate School for Computer Science and
Engineering (HeCSE) and the Academy of Finland. I am thankful to Professor Olli
Martikainen and to Petteri Koponen and Juhana Räsänen for allowing the use
of Calypso as a case study. Part of the work was done while the author was at
UC Davis Computer Security Laboratory.
References
1. Martin Abadi. On SDSI's linked local name spaces. In Proc. 10th IEEE Computer
Security Foundations Workshop, pages 98-108, Rockport, MA, June 1997. IEEE
Computer Society Press.
2. Martin Abadi, Michael Burrows, Butler Lampson, and Gordon Plotkin. A calculus
for access control in distributed systems. ACM Transactions on Programming
Languages and Systems, 15(4):706-734, September 1993.
3. Tuomas Aura. Fast access control decisions from delegation certificate databases.
In Proc. 3rd Australasian Conference on Information Security and Privacy ACISP
'98, volume 1438 of LNCS, pages 284-295, Brisbane, Australia, July 1998. Springer
Verlag.
4. Tuomas Aura. On the structure of delegation networks. In Proc. 11th IEEE
Computer Security Foundations Workshop, pages 14-26, Rockport, MA, June 1998.
IEEE Computer Society Press.
5. Tuomas Aura and Dieter Gollmann. Software license management with smart
cards. In Proc. USENIX Workshop on Smartcard Technology, Chicago, May 1999.
USENIX Association.
6. Tuomas Aura, Petteri Koponen, and Juhana Räsänen. Delegation-based access
control for intelligent network services. In Proc. ECOOP Workshop on Distributed
Object Security, Brussels, Belgium, July 1998.
7. D. Elliott Bell and Leonard J. LaPadula. Secure computer systems: Unified ex-
position and Multics interpretation. Technical Report ESD-TR-75-306, The Mitre
Corporation, Bedford MA, USA, March 1976.
8. Matt Blaze, Joan Feigenbaum, John Ioannidis, and Angelos D. Keromytis. The role
of trust management in distributed systems security. In J. Vitek and C. Jensen,
editors, Secure Internet Programming: Security Issues for Distributed and Mobile
Objects, LNCS. Springer-Verlag Inc, New York, NY, USA, 1999.
9. Matt Blaze, Joan Feigenbaum, and Jack Lacy. Decentralized trust management.
In Proc. 1996 IEEE Symposium on Security and Privacy, pages 164-173, Oakland,
CA, May 1996. IEEE Computer Society Press.
10. Matt Blaze, Joan Feigenbaum, and Martin Strauss. Compliance checking in the
PolicyMaker trust management system. In Proc. Financial Cryptography 98, vol-
ume 1465 of LNCS, pages 254-271, Anguilla, February 1998. Springer.
11. David F. Brewer and Michael J. Nash. The Chinese wall security policy. In Proc.
IEEE Symposium on Research in Security and Privacy, pages 206-214, Oakland,
CA, May 1989. IEEE Computer Society Press.
12. Recommendation X.509, The Directory - Authentication Framework, volume VIII
of CCITT Blue Book, pages 48-81. CCITT, 1988.
13. Carl M. Ellison. Establishing identity without certification authorities. In Proc.
6th USENIX Security Symposium, pages 67-76, San Jose, CA, July 1996. USENIX
Association.
14. Carl M. Ellison, Bill Frantz, Butler Lampson, Ron Rivest, Brian M. Thomas, and
Tatu Ylönen. SPKI certificate theory, Simple public key certificate, SPKI examples.
Internet draft, IETF SPKI Working Group, November 1997.
15. Carl M. Ellison, Bill Frantz, Butler Lampson, Ron Rivest, Brian M. Thomas, and
Tatu Ylönen. SPKI certificate theory. Internet draft, IETF SPKI Working Group,
October 1998.
16. M. Gasser, A. Goldstein, C. Kaufman, and B. Lampson. The digital distributed
system security architecture. In Proc. National computer security conference, pages
305-319, Baltimore, MD, USA, October 1989.
17. Li Gong. A secure identity-based capability system. In Proc. 1989 IEEE Sympo-
sium on Research in Security and Privacy, pages 56-63, Oakland, CA, May 1989.
IEEE, IEEE Computer Society Press.
18. J. Kohl and C. Neuman. The Kerberos network authentication service (V5). RFC
1510, IETF Network Working Group, September 1993.
19. Petteri Koponen, Juhana Räsänen, and Olli Martikainen. Calypso service architec-
ture for broadband networks. In Proc. IFIP TC6 WG6.7 International Conference
on Intelligent Networks and Intelligence in Networks. Chapman & Hall, September
1997.
20. Ilari Lehti and Pekka Nikander. Certifying trust. In Proc. First International
Workshop on Practice and Theory in Public Key Cryptography PKC'98, volume
1431 of LNCS, Yokohama, Japan, February 1998. Springer.
21. Nataraj Nagaratnam and Doug Lea. Secure delegation for distributed object envi-
ronments. In Proc. 4th USENIX Conference on Object-Oriented Technologies and
Systems (COOTS), pages 101-115, Santa Fe, NM, April 1998. USENIX Associa-
tion.
22. A guide to understanding discretionary access control in trusted systems. Technical
Report NCSC-TG-003 version-1, National Computer Security Center, September
1987.
23. Pekka Nikander and Lea Viljanen. Storing and retrieving Internet certificates.
In Proc. 3rd Nordic Workshop on Secure IT Systems NORDSEC'98, Trondheim,
Norway, November 1998.
24. Ronald L. Rivest and Butler Lampson. SDSI — A simple distributed security
infrastructure. Technical report, April 1996.
25. Edward P. Wobber, Martín Abadi, Michael Burrows, and Butler Lampson. Authen-
tication in the Taos operating system. ACM Transactions on Computer Systems,
12(l):3-32, February 1994.
26. Philip Zimmermann. The Official PGP User's Guide. MIT Press, June 1995.
A View-Based Access Control Model for CORBA
Gerald Brose
Abstract. Specifying and managing access control policies for large dis-
tributed systems is a non-trivial task. Commonly, access control policies
are specified in natural language and later reformulated in terms of a par-
ticular access control model. This paper presents and discusses concepts
for an object-oriented access model that is more suitable for describing
access control policies for CORBA objects than the default access model
specified in the OMG security service specification.
1 Introduction
Specifying and managing access control policies for large distributed systems is
a non-trivial task. Commonly, access control policies are initially specified in
natural language and later reformulated in terms of a particular access control
model, e.g. a label-based or access matrix-based model. Because controlling the
dynamic evolution of policies and adjusting them through administrative operations,
i.e. managing them, is highly sensitive and at the same time error-prone if done
at too low a level of abstraction, language support for this kind of activity would
be desirable.
As has been pointed out above, the language for specifying access control
policies is defined by the underlying access model. If the concepts and abstrac-
tions of this model are not well designed, this has direct bearing on the quality
of the policies that are to be written using that model.
In this paper, we take a software engineering or language-based approach
to designing access policies and present an alternative access control model for
CORBA [16] systems. We believe this model is better suited to express advanced
policy concepts than the default access model specified in the OMG's Security
Service Specification [17]. Also, it provides policy designers with more abstract
concepts for specifying authorizations. This model is based on the concept of
views, an object-oriented approach to specifying access rights.
The rest of this paper is organized as follows. Section 2 outlines the CORBA
access control model and examines the default access control policy as defined in
the Security Service. Section 3 introduces the concept of views and sketches how
it can be used in implicit authorizations and for delegation. Section 4 mentions
related work and Section 5 concludes the paper with a summary and an outlook
on future work.
2.1 Concepts
We briefly explain and review the main concepts of the default access model.
Principals. CORBA refers to both human users and system entities acting on
their behalf as principals. Authorizations are granted to principals only indirectly
on the basis of the security attributes they possess rather than directly using
identities. Possible security attributes include (but are not limited to) access
identifier, group and role names, clearance level and capabilities. If a number
of principals share a common attribute value, they implicitly form a group to
which access rights can be granted.
Domains. Objects are grouped into security policy domains, i.e. sets of ob-
jects to which the same security policies apply. For every domain, there is a
DomainManager object that knows about the policies and the members of the
domain.
Rights and operations. The security service defines individual rights in rights
families. The default rights family is corba and contains four generic rights: g,
s, m, u for get, set, manage and use. The definition of new rights families, albeit
possible, is discouraged to keep policies simple.
Authorizations are checked per individual operation. The default access model
defines no explicit grouping construct for operations. Operations are, however,
grouped implicitly by the access rights they require. These required rights for
an object access using a particular operation are to be defined by interface de-
velopers and specified per object type. Thus, the specification of required rights
defines a mapping from generic access rights to actual operations. Required rights
are defined as a combination of rights by stating whether the given rights are
required in conjunction ("all"-combinator) or whether any of the listed rights
is sufficient for the operation ("any"-combinator). They are stored in a global
table and are not part of any individual access policy.
Using a domain's DomainAccessPolicy object, a particular access policy
is defined by granting effective rights to principals identified by their security
attributes. One implicit security attribute that is set by the ORB for every
access is the delegation state. This attribute makes it possible to identify for every request
whether the principal is the initiator of the request or an intermediate in a call
chain. It is thus possible to grant a different set of effective rights depending
on the delegation state of the calling principal. Effective rights are registered in
tables and compared to the required rights for an operation upon access. If a
principal's effective rights match the required rights, the access is allowed.
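The matching step might be sketched as follows; the function and the set representation are illustrative, not the interfaces defined in the specification.

```python
def access_allowed(effective, required, combinator):
    """effective: rights granted to the principal; required: rights
    the operation demands; combinator: "all" or "any"."""
    required = set(required)
    if combinator == "all":
        return required <= set(effective)   # every required right held
    if combinator == "any":
        return bool(required & set(effective))  # at least one right held
    raise ValueError("unknown combinator")

# e.g. an operation requiring corba:g AND corba:u
print(access_allowed({"g", "u", "s"}, {"g", "u"}, "all"))  # True
print(access_allowed({"g"}, {"g", "u"}, "all"))            # False
print(access_allowed({"g"}, {"g", "u"}, "any"))            # True
```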
We will sketch a hypothetical access policy for naming context objects using the
CORBA default access model. Naming contexts have the interface
CosNaming::NamingContext from the OMG's name service specification [15]. The IDL for this
interface is given in Figure 1. The policy is to distinguish between different uses
module CosNaming {
  interface NamingContext {
    void bind(in Name n, in Object obj);
    void rebind(in Name n, in Object obj);
    void bind_context(in Name n, in NamingContext nc);
    void rebind_context(in Name n, in NamingContext nc);
    Object resolve(in Name n);
    void unbind(in Name n);
    NamingContext new_context();
    NamingContext bind_new_context(in Name n);
    void destroy();
    void list(in unsigned long how_many,
              out BindingList bl, out BindingIterator bi);
  };
};
Please note that the hierarchical nature of these authorization types is in-
cidental and simply a consequence of the example policy we chose to present
here; it is not characteristic of our approach. While authorizations of the first
two types would be considered "public", authorizations of types 3 and 4 are only
granted to privileged users of two different groups. The last authorization is to
be granted to a small group of administrators only.
Defining Required Rights. Before this policy can be expressed in the de-
fault CORBA access model and effective rights can be granted to holders of
the appropriate security attributes, required rights for each operation of the
NamingContext interface have to be defined. The rights combination for each
operation can be specified using rights from
1. the default corba rights family exclusively.
2. another rights family, perhaps one that was introduced for exclusive use with
the type NamingContext.
3. different rights families.
To keep management of rights simple, it would be desirable to refrain from
defining new rights families and select the first of these options. However, it is
not always straightforward to model the intended semantics of authorization
using the generic rights "get", "set", "manage" and "use". Note that these
rights do not just describe distinct operation types (g, s, u) but also the level
of sensitivity for operations (the m right) [1].
A more serious limitation with using only the small set of rights from the
corba family is that no more than 16 different combinations of rights can be
distinguished. Thus, it is highly likely that a number of operations will require
the same set of rights, especially if required rights are defined by individual in-
terface designers and not by security administrators and the domain contains
objects of a number of different types. If required rights are not unique, however,
ensuring the principle of least privilege is very difficult if not impossible with
this access model. Because rights are generic, principals can invoke any opera-
tion on any object of any interface in the domain that happens to match the
effective rights combination granted to them. As an example, imagine that the
operation list() on naming contexts and the operation create() on some ob-
ject of type JPEGFactory were to require corba:g and corba:u. Any principal
that was granted this combination of required rights in a domain where objects
of these types exist would be allowed both operations and possibly others as
well. Determining which operations will be permitted with a given set of rights
implies searching the whole required rights table because grouping operations by
their required rights can only be done implicitly.
The second of the options outlined above would avoid this problem of in-
terfering rights altogether. Also, defining a new rights family per IDL interface
would allow policy designers to use access rights that directly correspond to the
intended usages of these interfaces. However, introducing new rights families not
only complicates management, it is also very cumbersome in practice as there
are neither language support nor management interfaces for this task.
As a compromise, the basic corba rights could be used in conjunction with a
minimal set of newly introduced, type-specific rights that help ensure unique-
ness of required rights. For our example, we introduce a new rights family naming
with two rights n, m for naming and manage. The right naming:n is used to
make this required rights combination type-specific; the second is necessary to
distinguish between the last two authorization types. These are AND-combined
with the default corba rights to specify the required rights as in Figure 2.
Note that the rights required for, e.g., operation bind() include those re-
quired for list(). This is to model the policy feature that whoever has authority
to invoke bind() can also invoke list(). Also note that to model this property
of the policy, we had to give up using the four corba rights as a consistent
classification of operation types: operation bind() had to require corba:gsu,
although it does not read the state of a naming context, as the presence of the
right g would suggest. The example shows that this access model cannot
reconcile the intended semantics with a descriptive and readable specification.
Third, the default access model does not define any structural relations for
its access control concepts, e.g. hierarchies of groups, rights or domains. Domain
hierarchies are mentioned in the specification but not defined. Also note that the
management interfaces to the default domain access control model are at a low
level of abstraction and do not offer any language support for the sensitive and
error-prone tasks of specifying and managing access rights. Other than domains,
no abstractions are defined that would help structuring large specifications.
3 Views
To remedy the problems pointed out in the previous section we propose a new
access model based on views. A view definition introduces an authorization type
as a named subset of an object type's operations. Thus, individual access rights
directly correspond to operations. Access policies can be described in terms
of principals holding view-typed authorizations on objects. To allow for fine-
grained access control, views can be granted either on a single object of that
type or on all objects in the type's extension that belong to the domain. If a
principal requests access to an object via one of its operations and this operation
is contained in one of the views on that object held by the principal, the access
is allowed.
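The decision rule might be sketched like this; the view, principal and object names are invented.

```python
# A view is a named set of operations on one IDL type; an access is
# allowed iff the invoked operation occurs in some view the principal
# holds on the object.
views = {
    "Reader": {"resolve", "list"},
    "Binder": {"resolve", "bind", "unbind"},
}

# held[(principal, obj)] = names of views granted on that object
held = {("alice", "ctx1"): {"Reader"},
        ("bob", "ctx1"): {"Binder"}}

def allowed(principal, obj, operation):
    return any(operation in views[v]
               for v in held.get((principal, obj), set()))

print(allowed("alice", "ctx1", "resolve"))  # True
print(allowed("alice", "ctx1", "bind"))     # False
```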
Note that, although this access model appears to call for a capability-based
implementation, this is not an inherent requirement of the model. While individ-
ual authorizations, i.e. view instances, are always related to either a single object
or to an open set of objects (a type extension), they are typed, first-class objects
in their own right. Thus, an implementation of this model is free to represent
the relation between authorizations and objects in any way of its choosing.
Views, i.e. authorization types, help structuring access policies by defining
different usages for object interfaces from the point of view of a security policy
designer. Policy design will obviously benefit from existing use-case diagrams or
role models that describe the intended object interactions from the application
designer's point of view. In spite of this close relation between the design of ob-
ject interfaces and views, it is desirable to keep them separate for two reasons.
First, decoupling interface design from security aspects makes it possible to reuse
applications in contexts with different security requirements or, vice versa, to design a
security policy for existing CORBA applications without having to rewrite their
IDL specifications. Second, it enables us to provide specially tailored abstrac-
tions for describing access policies without having to integrate these with other,
orthogonal language features. In particular, we do not want to extend CORBA
IDL with security annotations.
The authorization types for the example policy from the previous section can be
described directly and conveniently as views as shown in Figure 3.
Views are defined as access controls for a particular IDL interface, which is
referenced in the controls-clause of the view definition. In the example, the base
view CosNamingResolver controls the IDL type CosNaming: :NamingContext.
As a consequence, authorizations granted to give access to objects of one type
cannot interfere with other authorizations because of different types, even if
operation names happened to be identical.
View definitions can be related through inheritance so that their specifica-
tions can be reused. This relation between views is expressed by listing parent
views after the colon. In the example, all view definitions directly or indirectly
extend CosNamingResolver, so they inherit the operation resolve allowed by
CosNamingResolver and additionally permit the operations listed in their own
definitions. View inheritance is monotonic in the sense that extending views may
only add access permissions.
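View inheritance could be modelled as in this sketch. CosNamingResolver is taken from the example above, while the Binder view and the table layout are invented; the monotonicity property is that an extending view permits its own operations plus everything its parents permit.

```python
# view name -> (parent views, operations allowed by this definition)
view_defs = {
    "CosNamingResolver": (set(), {"resolve"}),
    "Binder": ({"CosNamingResolver"}, {"bind", "rebind"}),
}

def permitted(view):
    """Operations permitted by a view: its own plus its parents'."""
    parents, own = view_defs[view]
    ops = set(own)
    for p in parents:
        ops |= permitted(p)
    return ops

print(permitted("Binder"))   # resolve, bind and rebind
# Monotonicity: a child permits at least what each of its parents permits.
assert permitted("CosNamingResolver") <= permitted("Binder")
```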
Apart from syntactic convenience, view definitions have the advantage that
operations are explicitly grouped. It is thus always obvious which operations
will be permitted if a particular view on an object is granted to a principal.
Being able to give names to sets of individual rights also adds descriptiveness as
these names can refer to different usage aspects of a particular IDL type. Note
that views can also be regarded as divisions of IDL interfaces according to the
intended uses of these interfaces, or as retrospectively defined interfaces.
As in the original CORBA access model, authorizations can be granted to
principals indirectly based on the value of some security attribute. The security
service does not mandate any particular way of defining and managing a mapping
of security attributes to users. For syntactic convenience, we suggest introducing
simple identifiers for predicates over security attributes or combinations thereof
and use these predicate names to denote those principals for which the predicate
holds true.
The view-based access model introduced in this section provides a structured
way of defining rights that is better suited to express complex policies than the
original concept of rights, which was not structurally related to object types. In
the following, we will extend the expressiveness of the view language with con-
cepts that address implicit authorizations and denials as well as the delegation
of access rights.
may be defined ad hoc. They are not considered for resolving authorization con-
flicts.
First, an authorization is more specific than another if it is defined by a view
that is more specific than the view used to define the other, i.e. if the first view
extends the second. An example for such a case is given in Figure 4. Assuming
an IDL definition for a type T with operations op_l to op_5, we may specify a
view BaseView that grants access to operations op_l and op_2 while explicitly
denying op_3 and op_4. Moreover, op_3 is declared as a strong authorization with
the keyword final, so it cannot be overridden in refining views. A view definition
that attempted to do so would be rejected. DerivedView refines BaseView and
is thus more specific, so its authorizations will override the ones inherited from
BaseView in case of conflicts. Such a conflict arises because DerivedView allows
op_4, which was inherited from BaseView as denied. Additionally, DerivedView
allows access to objects of type T using op_5.
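The conflict-resolution rules of this example might be reconstructed as follows; the data layout and the refine function are our own invention, but the behavior mirrors the text: a more specific view overrides inherited authorizations, except that a final denial can never be re-allowed.

```python
base = {"allow": {"op_1", "op_2"}, "deny": {"op_3", "op_4"}, "final": {"op_3"}}

def refine(parent, allow=frozenset(), deny=frozenset()):
    """Build a more specific view whose authorizations override the
    parent's in case of conflict, except for final denials."""
    if set(allow) & parent["final"]:
        # a definition that re-allows a final denial is rejected outright
        raise ValueError("cannot override final denial")
    return {"allow": (parent["allow"] - set(deny)) | set(allow),
            "deny": (parent["deny"] - set(allow)) | set(deny),
            "final": parent["final"]}

derived = refine(base, allow={"op_4", "op_5"})  # overrides the denial of op_4
print("op_4" in derived["allow"])               # True: more specific wins
try:
    refine(base, allow={"op_3"})                # op_3 is final
except ValueError as err:
    print(err)                                  # cannot override final denial
```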
Our access control model unifies the concepts of operation and access right: View
definitions list operations, and holding a particular view on an object permits
access to this object with precisely the operations listed in that view. Since
operations are indivisible, so are access rights. While we believe this model to be
more appropriate for object-oriented systems than approaches which separate
access rights from operations, our model lacks the flexibility offered by the option
of combining access rights to obtain a certain privilege.
As an example, consider an access control policy for a safe with multiple locks
that states the safe can only be opened when a certain number of keyholders
cooperate. Accessing the safe should require multiple rights which correspond to
the keyholders' keys. Using the model outlined so far, however, we cannot express
When a principal tries to open a Safe object, the access decision function
checks not only that the principal holds a view that lists open as allowed, but
also whether the principal possesses the additional authorizations key1, key2
and key3 for this object that are required by open. If it does not, the principal's
access will be denied and it will have to obtain the missing rights, e.g. by having
other principals grant them. In cases like this, where the cooperation between
principals requires the exchange of access rights for a single operation invoca-
tion, the introduction of further delegation restrictions appears necessary so that
granting principals could specify how often access rights may be used or for how
long a granted view remains valid.
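The combined check described here might be sketched as follows, with invented names throughout: besides a view listing the operation, the principal must have collected every auxiliary authorization the operation requires.

```python
# operation -> auxiliary authorizations it additionally requires
required_aux = {"open": {"key1", "key2", "key3"}}

def can_invoke(op, held_ops, held_aux):
    """held_ops: operations the principal's views allow;
    held_aux: auxiliary authorizations the principal has collected,
    e.g. by having other principals grant them."""
    return op in held_ops and required_aux.get(op, set()) <= held_aux

print(can_invoke("open", {"open"}, {"key1", "key2"}))           # False
print(can_invoke("open", {"open"}, {"key1", "key2", "key3"}))   # True
```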
4 Related work
Using type information for specifying fine-grained object protection is, of course,
not new. Language-based approaches such as [11,19] typically rely on an ex-
tended notion of type that makes it possible to selectively hide some of an abstract data
type's operations for access control. We are not, however, aware of approaches
that establish a separate type concept for authorizations.
Views are a well-known concept in relational and object-oriented databases
[20]. Their use for access control purposes resembles the use of type abstraction
as a protection concept. Unlike database views that can span multiple types, a
view in our model is restricted to objects of a single IDL type. Joining views
on different IDL types T1, ..., Tn can, however, be modelled by specifying an
additional IDL interface T that extends T1, ..., Tn and defining a view on T.
Another difference is that database views may define content-specific access
controls, e.g. by stating that an attribute may only be read if its value is above
a certain threshold. This is not possible in our model.
In [10], grouping concepts for objects, principals and access rights have been
proposed in the context of downloadable code, but without proposing a language
for the specification of access control policies. While this model also addresses
implicit and negative authorizations, rights delegation and combination are not
an issue.
In [8], views are lists of allowed operations that are used to define the access
rights contained in capabilities. By recursively annotating parameters of view
operations with other view names in interface definitions, protection require-
ments for arguments can be expressed. At run-time, access rights according to
these views are passed implicitly. Structural relations between views and nega-
tive rights are not addressed.
5 Conclusion
We have analyzed the CORBA default access model and identified a number
of restrictions that make it desirable to replace this access model. We have
presented and discussed views as the basic concept of a new access control model
that will allow policies to be expressed at a higher level of abstraction. We have
also sketched how implicit authorizations and delegation could be expressed in
access policies based on this model.
We are currently working on an implementation of the concepts presented in
this paper as part of a partial security service implementation for our own Java
implementation of the CORBA specification, JacORB [3]. In this implementa-
tion, view definitions are compiled into runtime representations that reside in
a view repository. The function of such a view repository is similar to that of
the CORBA interface repository. We are currently evaluating the possibility of
representing authorizations in the form of SPKI certificates [5]. Future work in-
cludes formalizing the concepts presented in this paper and exploring how role
models could be included.
Acknowledgements
I would like to thank Peter Lohr and Richard Kemmerer for valuable discussions.
References
Christian Tschudin
Most network services today are provided by stationary programs, either at the appli-
cation level (e-mail) or at the network level (routing). Programmable networks make it
possible to reconfigure the network's nodes and to bind servers to new physical locations at
run-time. "Network-aware services" may choose different server locations for optimiz-
ing the quality of service [5]. Similarly, application level gateways were proposed that
can perform transcoding or downgrading of multimedia data [1]. Within such proxy
architectures, the thin clients - typically mobile devices with wireless links to the fixed
network - can program the gateway by uploading servlets. In both cases we have rather
large servers and services with a limited amount of mobility.
By reducing the granularity of mobility, we can think of distributed network services
that consist of many tiny mobile programs going back and forth. Maintaining routing
tables, for example, does not require large server programs sitting somewhere in the
network: Small active packets can collect and propagate topology changes to all nodes,
updating the routing tables directly. A natural form of mobility of such services consists
in having them look around for "unserved" nodes to which the service will attempt to
extend. Such services will self-deploy and form a floating, gas-like service cloud that
reaches into every niche of the network [12]. It is also possible that such a service cloud
retracts from selected nodes because offering the service is not viable anymore for some
places due to increased resource competition or lack of clients. For the rest of this paper
we will focus on such fine-granular mobile code based network services and call them
highly distributed mobile services (HDMS).
2 Apoptosis
Research in biology has revealed that cells have a limited capacity to divide (mitosis).
This is not due to physical limitations - like for example exploiting some resource
beyond usability - but is a predetermined, intrinsic behavior of the cell. Mechanisms at
the molecular level are in place that can trigger the self-destruction of a cell. Several
reasons have been identified why it makes sense that a cell commits suicide (two recent
publications that give an overview of this field are [9,2]):
Cell death is as important as mitosis:
During the growth of an organism and the specialization of cells it is necessary
that some cells yield the place they occupy. Fingers, for example, are formed by
apoptosis of the tissue between them. Another example is the formation of the con-
nections between neurons in the brain that requires that surplus cells be eliminated
by apoptosis.
Combating cells infected with a virus:
Cytotoxic T lymphocytes (CTL) can kill virus-infected cells by inducing apoptosis,
i.e., killing the cell and the virus.
The important aspect of apoptosis is that the cell's self-destruction proceeds in a pro-
grammed and controlled way. The cell starts to shrink, it decomposes internal structures
and degrades all internal proteins. Then the cell breaks into small, membrane-wrapped
fragments ("fall off") that later on will be engulfed by phagocytic cells for recycling (see
figure 1). This process does not induce inflammation; no toxic substances are leaked to
the cell's environment.
Fig. 1. Two forms of cell death: Programmed cell death (apoptosis) and necrosis due to injury.
The other form of cell death is necrosis, the un-programmed death of a cell. This
happens if a cell is injured, either by mechanical damage or exposure to toxic chemicals.
The response will be that the cell swells because the membrane's function is disrupted.
The cell contents leak out, which eventually leads to an inflammation of the surrounding
tissue until phagocytic cells remove all cell debris.
In cell biology, two ways are known to induce apoptosis. A cell can be led to
commit suicide either by withdrawing specific positive signals, i.e., signals needed for
continued survival such as growth factors, or by negative signals such as an increased
level of oxidants within the cell, UV light, X-rays, chemotherapeutic drugs, as
well as death activators specifically installed to induce apoptosis.
Surprisingly enough, the protection against an "unauthorized" triggering of a cell's
self-destruction mechanism is not the major issue in biology. Because viruses as well as
cancer genes require a cell to function for their own procreation, they will rather try to
avoid destroying their host. This means that there is a natural selection of benign genes
that are able to block apoptosis procedures. The battle between viruses and the immune
system thus is rather a question of keeping the apoptosis mechanism intact.
In the context of active networks we have a different constellation. Active packets
in general do not vitally depend on each other but execute independently of each other.
Instead of a parasitic relationship there is competition between them, making the elim-
ination of a rival mobile service a valuable goal: The apoptosis entry point of a mobile
service would be a primary target for an attack. In the following section we will show
a simple way to protect this deadly entry point.
key := read() ;
result := decrypt(key, ENCRYPTED_CODE);
IF is_valid(result) THEN
execute(result);
FI
3.2 An Implementation
The simple code transformation presented above was implemented in July 1997 for the
MO system [11]. MO lends itself very well to this because of its interpreted nature
and the ease with which code can be generated at runtime. A simple procedure was
written that takes a pass-phrase together with the MO code that is to be secured. The
pass-phrase is hashed using the MD5 digest algorithm, the lower 64 bits are used as
the DES key for encryption. The code, together with a nonce value and a message
digest of the code, is then encrypted and emitted inside the "wrapper" code as described
in section 3.1. The wrapper consists of an MO procedure that takes an environmental
data item, applies the hash function, decrypts the given string constant, checks for the
correct digest value and, if correct, requests the execution of the decrypted code. The
transformation procedure itself fits in 12 lines of MO code.
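The transformation can be sketched in Python; this is an illustrative reconstruction, not the original 12 lines of MO code. Since DES is not in Python's standard library, an MD5-derived XOR keystream stands in for it here, and all function names (wrap, unwrap, keystream) are ours:

```python
# Sketch of the environmental-key wrapper: MD5 of the pass-phrase yields
# the key (the text uses its lower 64 bits as a DES key); the code is
# encrypted together with a nonce and its digest, and only executed if
# the digest checks out after decryption.
import hashlib, os

def keystream(key, n):
    out, counter = b"", 0
    while len(out) < n:
        out += hashlib.md5(key + counter.to_bytes(8, "big")).digest()
        counter += 1
    return out[:n]

def wrap(passphrase, code):
    """Encrypt `code` (with a nonce and its digest) under a key derived
    from the environmental pass-phrase."""
    key = hashlib.md5(passphrase).digest()[:8]   # 64 bits of the hash, as the key
    payload = os.urandom(8) + hashlib.md5(code).digest() + code
    return bytes(a ^ b for a, b in zip(payload, keystream(key, len(payload))))

def unwrap(passphrase, blob):
    """The wrapper's runtime half: decrypt, check the digest and, only if
    it matches, return the code for execution."""
    key = hashlib.md5(passphrase).digest()[:8]
    payload = bytes(a ^ b for a, b in zip(blob, keystream(key, len(blob))))
    digest, code = payload[8:24], payload[24:]
    return code if hashlib.md5(code).digest() == digest else None

blob = wrap(b"secret pass-phrase", b"print('apoptosis')")
assert unwrap(b"wrong guess", blob) is None        # wrong environment: stays inert
assert unwrap(b"secret pass-phrase", blob) == b"print('apoptosis')"
```

As in the text, an observer of the wrapper learns only the encrypted string and the digest check; without the environmental pass-phrase the apoptosis code remains clueless data.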
4 Discussion
The presented welding together of an equality test with the conditional code is described
in [7], as are further possible code transformations that enhance the "cluelessness" of
mobile code. Note that the presented approach works for terminating a distributed ser-
vice because we do not require the privacy of the apoptosis code to be maintained be-
yond the moment where we want the service to terminate. To obfuscate or completely
hide the functionality of a piece of executable code for longer time spans is a much
harder problem [3,4,8]. Our hope is that more simple building blocks like the one
presented in this paper will be discovered, making security-sensitive software fully mobile
and deployable in unsecured environments. Active networks will be an important appli-
cation domain for this.
The analogy proposed in this paper between the programmed death of a cell and the
self-termination of a distributed service points to the question of how active networks
will be steered in the future. From an engineering point of view it makes sense that
if we are required to firmly control the execution of mobile services, then we should
build the support for this into the active network substrate. Thus, the approach would
be to create privileged control channels that are different (or even separated) from the
rest of the network infrastructure. However, it may be difficult, if not impossible, to
come up with a universally applicable, reliable and scalable termination service. As an
indication of the problems to be expected one can look at the related problems of finding
a reliable broadcast protocol, or effective support for distributed debugging. In our
research we will continue to handle control and data at the same level, as the insights
from biology suggest.
References
1. Amir, E., McCanne, S. and Katz, R.: The Media Gateway Architecture: A Prototype for
Active Services. Proc. SIGCOMM'98, Vancouver, Canada, 1998.
2. Bowen, I.D., Bowen, S.M. and Jones, A.H.: Mitosis and Apoptosis - Matters of Life and
Death. Chapman & Hall, 1998.
3. Collberg, C., Thomborson, C. and Low, D.: Manufacturing Cheap, Resilient, and Stealthy
Opaque Constructs. Proc. Principles of Programming Languages 1998 (POPL'98), San
Diego, California, Jan 1998.
4. Hohl, F.: Time Limited Blackbox Security - Protecting Mobile Agents From Malicious
Hosts. In Vigna, G. (Ed.): Mobile Agents and Security. LNCS 1419, Springer, April 1998.
5. Ranganathan, M., Acharya, A., Sharma, S., Saltz, J.: Network-Aware Mobile Programs.
Proc. USENIX 97, Anaheim, California, USA, 1997.
6. Ray, T.: An Approach to the Synthesis of Life. In Langton, C., Taylor, C., Farmer, J. and
Rasmussen, S. (Eds): Artificial Life II, Redwood City, CA, 1991.
7. Riordan, J. and Schneier, B.: Environmental Key Generation Towards Clueless Agents. In Vi-
gna, G. (Ed.): Mobile Agents and Security. LNCS 1419, Springer, April 1998.
8. Sander, T. and Tschudin, C.: Towards Mobile Cryptography. Proc. IEEE Symposium on
Security and Privacy, Oakland, May 1998.
9. Sluyser, M. (Ed): Apoptosis in Normal Development and Cancer. Taylor & Francis, London,
1996.
10. Todd, M.: Artificial Death. Proc. 2nd European Conference on Artificial Life (ECAL93),
Brussels, Belgium, 1993.
11. Tschudin, C.: The Messenger Environment MO - a Condensed Description. In Vitek, J. and
Tschudin, C. (Eds): Mobile Object Systems - Towards the Programmable Internet. LNCS
1222, Springer, April 1997.
12. Tschudin, C.: A Self-Deploying Election Service for Active Networks. To appear in Proc.
3rd Int. Conference on Coordination Models and Languages, Amsterdam, April 1999.
LNCS, Springer.
A Sanctuary for Mobile Agents
Bennet S. Yee
1 Introduction
The Sanctuary project at UCSD is building a secure infrastructure for mobile
agents, and examining the fundamental security limits of such an infrastructure.
First, what do we mean by "secure"?
An obvious issue is the privacy of computation. With standard approaches
for agent-based systems, a malicious server has access to the complete internal
state of an agent: software agents have no hope of keeping cryptographic keys
secret in a realistic, efficient setting. Distributed function evaluation approaches
may seem to apply, but they require an unrealistic fault model and are not likely
to ever be practical. Approaches such as [1,2,27] are extremely expensive or
have very restricted domains.
The privacy of computation is only one aspect of the security picture: the in-
tegrity of computation is perhaps more critical. In agent-based computing, most
researchers have been concentrating on one side of the security issue: protecting
the server from potentially malicious agents. Related work in downloadable ex-
ecutable content (Java [13], Software Fault Isolation [29], Proof-Carrying Code
[24,25], OS extension mechanisms such as packet filters [21], type safe languages
[9,16], etc) all focus on this problem. The converse side of the agent security
problem, however, is largely neglected and needs to be addressed: how do we
protect agents from potentially malicious servers? Why should we believe that
the results returned by our software agents are actually correct and have not been
tampered with?
3 Partial Solutions
How can software agents be protected from malicious servers? This is a critical
security problem to be solved if we are to have faith in agent-based computing.
In the following sections, we will examine several approaches and discuss their
limitations.
3.2 No Protection
both the origin and the integrity of these aspects of the agent. (Message au-
thentication codes are inappropriate, since potentially malicious agent servers
should not share secrets with the originator of the agent.) Other than crypto-
graphic techniques (if any) needed for the secure communication links, for now
we will not require the servers to perform any cryptography.
It may seem that agent code signing could be circumvented by a malicious
server, since the malicious server could tamper with the agent and then re-sign
it with its own key. This approach, however, is thwarted by the following design:
agents are constrained to send their results only to the entity that signed them.
Thus, conceptually a server that re-signs an existing agent is simply performing
two actions at once: denying service to the true originator of the agent, and
sending out its own agent, possibly with initial data stolen from the "murdered"
agent.
Next, note that the originator can specify the order in which the software
agent will visit the airline servers. Abstractly, this is a circuit of the (complete)
graph connecting the airline servers, and the originator may choose this circuit at
the time of agent dispatch.
At any honest server, the agent code and its read-only state is checked when
the agent arrives, so if the malicious server tries to tamper with the agent code
or the read-only state the malicious server can not successfully pass the modi-
fied agent to an honest server. (Alternatively, the agent code and the read-only
state may be considered to be reloaded from the originator by every server.) Fur-
thermore, we assume that the variable agent state is transmitted among servers
using authenticated and encrypted channels, so that only the server that is the
intended migration target can receive the agent, as long as the agent is starting
from an honest server. Thus, the malicious server can not intercept an agent as
it migrates from an honest server to another server.
At any server, an agent may query the server's identity. At first glance, this
identity could be authenticated via a public key certification chain, with the root
certificate embedded as part of the read-only agent state. Note, however, that
the use of cryptographic authentication does not really help a software agent
to determine the hosting server's identity: since the server has control over the
agent's computation, the malicious server may simply cause the program counter
to bypass the cryptography-based identity query and force the program to take
the conditional branch(es) which corresponds to the desired (falsified) server
identity.
In addition to being able to ask for the identity of the current server, the
agent may also ask from which server it migrated. Because we assumed
that server-to-server communications use authenticated and encrypted channels,
servers will know from which server an agent arrived. If the agent is running on
an honest server, both these answers will be correct and they can be used to verify
that the agent had migrated on an edge on the intended migration circuit; if the
agent is running on the malicious server, these answers may be incorrect and
the agent's state may be modified so that it believes it is running on a different
server. In the special case where there is exactly one malicious server, this ruse
will be discovered when the agent migrates off of this server to an honest server.
If there are two or more malicious servers, the first malicious server encountered
by an agent can hand it off to any of the other malicious servers in the route
the agent is programmed to take. When the agent is passed on to the next
(honest) server, the agent is brainwashed to believe it had visited all the servers
in the original path between the two malicious servers, thus avoiding discovery. If
software agents are to depart from the route determined at agent dispatch time,
such departures must start and end at a malicious server.
Now, consider what visiting a malicious server can do to a software agent's
memory. The read-write state variables of an agent may be completely altered
by the malicious server; thus, an agent that has just left the malicious server can
not trust any of its memory: all information collected prior to this point in time
— including data from servers visited prior to visiting the malicious server —
are suspect. Thus, only the results of computation done by those servers from
the (maximal) honest suffix of the agent's route, assuming that the computation
is independent of any input from previous servers, should be trusted.
Server Replication In [23], Minsky et al. developed a general method for
mobile agent computation security, marrying some ideas from the fields of fault
tolerance and cryptography. They propose that servers should be replicated,
and that replicated agents on these servers can use voting and secret shar-
ing/resplitting to move from one phase of the computation to the next.
Unfortunately, the fault model assumed in the paper is completely unreal-
istic: it assumes that replicated servers fail independently. In our Fly-By-Night
Airlines example, all replicated www.flybynight.com servers are under the ad-
ministrative control of flybynight.com, and malicious attempts to brainwash
software agents would occur on all of these servers. And while bribery of in-
dividual administrators of replicated servers by an outside adversary might
be independent events, bribery of the software engineers responsible for the
www.flybynight.com Web site is a much more likely scenario. Even if we assume
that Fly-By-Night Airlines is trustworthy, replicated servers in the real world are
likely to consist of identical hardware running copies of the same software: any
security hole found by an external attacker that allows him/her to compromise
one of the replicated servers is very likely to permit him/her to compromise all
the servers.
Agent Replication While the general approach proposed by Minsky et al. fails
to be convincing, in certain special cases the fault tolerance style of approach can
solve or at least ameliorate the mobile agent security problem. Because server
replication does not help to reduce the risk of agent brainwashing, in the following
we will assume that there is only one server per administrative/security domain,
or when there are multiple servers in a domain, they are indistinguishable.
Consider the case where there is at most one malicious server in our airfare
minimization example. Assume that secure communication links exist between
the servers, and that the users possess individual certified public keys; servers
may use these keys to verify the origin of the software agents. (Secure commu-
nication channels may be constructed cryptographically if servers also possess
cryptographic keys to authenticate and encrypt data among the parties as
needed.) Because we are assuming that there is only one dishonest server, we
know that the agent must stay on the circuit prescribed during agent configura-
tion.
Suppose we choose some sequence of servers S = s1, s2, ..., sn. We configure
two software agents A1 and A2, where A1 will travel along S, and A2 will travel
over the reverse sequence sn, sn-1, ..., s1.
Recall that we are assuming at most one malicious server. The all-honest
servers case is trivial, so we can ignore that case; henceforth we will assume that
there is exactly one bad server. Without loss of generality, assume that server
si is malicious and that sj is run by the airline with the lowest fare (j <= i).
Furthermore, we assume that the malicious server will not attempt denial-of-
service attacks — it may do so by killing the software agent or by implanting
the belief that the lowest fare is offered by some third server which will later
repudiate this idea.
First, consider the j < i case. A1 will encounter the lowest-fare server (sj)
first, and when it arrives at the malicious server si, its memory of the lowest fare
seen so far may be altered. When A1 returns with its result, it will report either
si as the server with the lowest fare, or some sk where k > i if the malicious
server did not declare a fare lower than one that the agent will see later in its travels.
A2, on the other hand, will encounter the lowest-fare server after visiting the
malicious server. It will report the correct minimum price — since we assume
no denial-of-service attacks, the corrupt server will not have made this agent
believe that a (false) lower price exists elsewhere — and when A2 returns to the
user, the user will be able to determine the true minimum airfare.
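The argument can be illustrated with a small simulation; the routes, fares, and the run_agent function are hypothetical stand-ins, and the malicious server is modelled as simply overwriting the agent's read-write state (with no denial of service, per the assumption above):

```python
# Toy simulation of the two-agent scheme: A1 walks the route S, A2 walks
# its reverse; the single malicious server may rewrite an agent's memory
# of the lowest fare seen so far.

def run_agent(route, fares, bad, forged_fare):
    best_server, best_fare = None, float("inf")
    for s in route:
        if s == bad:
            # The malicious server can rewrite the agent's read-write state.
            best_server, best_fare = bad, forged_fare
        elif fares[s] < best_fare:
            best_server, best_fare = s, fares[s]
    return best_server, best_fare

route = ["s1", "s2", "s3", "s4"]
fares = {"s1": 300, "s2": 120, "s3": 250, "s4": 180}   # s2 is cheapest
bad = "s4"                                             # cheapest comes before bad on S

a1 = run_agent(route, fares, bad, forged_fare=500)     # A1's memory is wiped at s4
a2 = run_agent(route[::-1], fares, bad, forged_fare=500)

# A2 meets the malicious server first and the honest minimum afterwards,
# so taking the better of the two reports recovers the true lowest fare.
print(min([a1, a2], key=lambda r: r[1]))   # ('s2', 120)
```

The simulation mirrors the j < i argument: whichever agent visits the malicious server before the cheapest server returns the correct minimum.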
Next, consider the j = i case. When this occurs, the malicious server can
alter its price to be just below that of the second lowest price offered and still
get business. This corresponds to a Vickrey auction or second-price auction,
except the situation is upside-down: instead of the highest bidder paying the
second highest bid price to obtain the goods being auctioned off, we have the
lowest airfare offer selling tickets at the second lowest quoted price. Note that
Vickrey-style price determinations may be a desirable economic design choice
anyway, since Vickrey auctions are designed to maximize the flow of pricing
information so bidders have no economic interest to hedge and not bid (and
reveal) the true prices that they are willing to pay.
The above agent-replication approach provided a partial solution for a special
case — at most one malicious server — but the solution did not quite work "properly"
to compute the true minimum airfare: when j = i, we could only achieve
second-best pricing, where what we obtain is the second-best airfare minus e.
(We do not have sealed bids here since the minimization is done by the agent; an
alternative design would be to gather bids encrypted using the public key of the
agent originator, preventing servers from knowing each other's prices directly. Of
course, servers could send out their own agents to discover such "commodity" prices;
this may have to be done through anonymizing proxies if the pricing could depend
on the consumer's identity.)
Arguably, since airline servers may also send out agents to determine pricing at
other airlines — assuming price information can be obtained anonymously or
in such a way that we are assured that it is independent of consumer identity
(or race or age or ...) - Vickrey pricing may be the end effect whenever there is
great consumer price sensitivity in any case. Applying some basic cryptographic
techniques, however, we can do a little better.
in querying the airline's databases prior to finding this result, and we don't need
to prove that it took place and that it ran correctly).
where at each server si a partial result PRi is computed using key ki, and
servers s1, ..., sj are honest and do not expose the internal state of the agent,
then for all i, k with i <= j < k, sk cannot forge PRi, since sk must know ki
to change PRi.
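The forward-integrity property can be sketched with a one-way key chain, where each server derives the next key as a hash of the current one and erases the old key before migrating; function and variable names here are ours, not from the paper:

```python
# MAC-based partial result authentication codes (PRACs): each server MACs
# its partial result with key k_i, derives k_{i+1} = h(k_i), and erases k_i.
# A server compromised later holds only later keys and cannot forge
# earlier PRACs; the originator, holding k_0, can re-derive every key.
import hashlib, hmac

def next_key(k):
    return hashlib.sha256(k).digest()        # one-way key update

def prac(k, result):
    return hmac.new(k, result, hashlib.sha256).digest()

k0 = b"initial secret shared with the agent originator"
results = [b"fare@s1=300", b"fare@s2=120", b"fare@s3=250"]

k, pracs = k0, []
for r in results:                            # the agent's journey
    pracs.append(prac(k, r))
    k = next_key(k)                          # the old key is erased here

# The originator re-derives each k_i and verifies every PRAC:
k = k0
for r, tag in zip(results, pracs):
    assert hmac.compare_digest(prac(k, r), tag)
    k = next_key(k)

# A server compromised after step 2 knows only later keys; a PRAC it
# forges for step 1 with such a key will not verify:
forged = prac(next_key(next_key(k0)), b"fare@s1=9999")
assert forged != prac(k0, b"fare@s1=9999")
```

The one-way update is what makes the integrity "forward": disclosure of the current key reveals nothing about the keys already erased.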
Publicly Verifiable PRACs The MAC-based PRACs above require that the
agent originator maintain a secret key or keys in order to detect tampering
with the partial results. An obvious question is whether forward integrity can
be provided such that the integrity verification may be public — so that an
untrusted intermediate server not sharing the secret key with the originator
may nonetheless help detect tampering.
Like MAC-based PRACs, publicly verifiable PRACs are implemented by rely-
ing on the destruction of information when agents migrate. Here, we use a digital
signature system: when the agent is dispatched, it is given a list of secret signature
functions sig1(m), ..., sign(m), along with user-generated certificates for
their corresponding verification functions verif1(m, s), ..., verifn(m, s). The verification
functions would be signed by the user's signature function sig_user(m).
Like simple MAC-based PRACs, we use sigi(m) to sign the partial result
computed on server si, and erase sigi(m) prior to migrating to server si+1.
Similar to one-way function MAC-based PRACs, we can also defer key gen-
eration, so that most of it is done on the servers, which presumably have greater
resources. Here, the agent is given an initial secret signature function sig1(m)
and a certified verification predicate verif1(m, s); the signature function is used
both to sign partial results and to certify new verification functions.
* Signing a function verifi(.) simply consists of signing the parameters that specify the
function; in RSA, these would be the two values ei, ni.
The verification predicate verif1(m, s) (and its certificate) is public, and the
signature function sig1(m) is secret. When the agent is ready to leave server s1,
it signs the partial result r1 by computing sig1(r1). Next, it chooses (randomly)
a new signature / verification function pair from the signature system, sig2(m)
and verif2(m, s), and computes sig1(verif2) to certify the new signature function.
Lastly, before the agent migrates to server s2, sig1 is destroyed.
To use publicly verifiable PRACs, the list of certified verification predicates
must be either published or carried with the agent. When these predicates
are available with the agent, publicly verifiable PRACs enjoy an important prop-
erty not available with MAC-based PRACs: while at server sj, the agent can
itself verify the partial results obtained while at servers si, where i < j. In par-
ticular, this means that computations that depend on previous partial results
can detect any integrity violation of those results — the agent's computation can
abort early, instead of having to finish the computation and detecting integrity
violation only when the agent results return to the agent originator.
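Assuming a hash-based one-time (Lamport) signature scheme in place of the paper's RSA instantiation (so the sketch runs with only the standard library), the chaining of signature and verification functions might look like this; all names are ours:

```python
# Publicly verifiable PRAC sketch: sig_1 signs both the partial result r_1
# and the next verification key, then the secret half is destroyed before
# migration. Anyone holding the certified vk_1 can verify the chain.
import hashlib, os

H = lambda b: hashlib.sha256(b).digest()

def keygen():
    sk = [(os.urandom(32), os.urandom(32)) for _ in range(256)]
    vk = [(H(a), H(b)) for a, b in sk]          # public: hashes of the secrets
    return sk, vk

def bits(m):
    d = int.from_bytes(H(m), "big")
    return [(d >> i) & 1 for i in range(256)]

def sign(sk, m):
    return [pair[b] for pair, b in zip(sk, bits(m))]   # reveal one preimage per bit

def verify(vk, m, sig):
    return all(H(s) == pair[b] for s, pair, b in zip(sig, vk, bits(m)))

def pack(vk):                 # a verification key as bytes, so it can be signed
    return b"".join(a + b for a, b in vk)

# Server s1: sign r1, certify the next verification key, then erase sk1.
sk1, vk1 = keygen()           # vk1 is certified by the user at dispatch time
sk2, vk2 = keygen()
r1 = b"partial result from s1"
sig_r1 = sign(sk1, r1)
cert_vk2 = sign(sk1, pack(vk2))
del sk1                       # destroyed before migrating to s2

# Any later server can check the chain using only public data:
assert verify(vk1, r1, sig_r1)
assert verify(vk1, pack(vk2), cert_vk2)
```

Destroying sk1 plays the same role as erasing the MAC key: later servers can verify, but cannot retroactively forge, the earlier partial results.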
Proof Verification We would like to get a guarantee that the agent's com-
putation was done according to the program specified in the agent. One possibility
is to forward the entire execution trace — or a signature of it [28] — to the
originator, who checks the trace if there is cause for suspicion. This however is
too costly. Furthermore, if only a signature is sent to commit to a trace, it is
unclear how to decide when to actually request the entire trace and then check
it. We would like to explore the use of holographic proof checking techniques [3].
This is quite a speculative idea. The current approaches are very theoreti-
cal. In principle they do help, but the cost in practice of existing solutions is
prohibitive. We are considering investigating ways to use the ideas in a more
practical way. Let us describe the ideas and the issues involved.
Call the program x. Let y denote an execution trace. Define the predicate
p(x, y) to be 1 if this trace is correct (corresponding to running x) and 0 other-
wise. The server does not want to send y. But it can encode y as a holographic
proof y'. This has the property that one needs to look at only a few bits of
y' to check that p(x, y) = 1. It is tempting from this to think that the server
can just transmit a few bits. But this does not work. The model necessary for
holographic proofs is that the verifier has available a fixed, "committed" proof
string y' that he can access at will. He will pick a few random positions here and
check something. So there is no choice but to transmit y' in its entirety. We will not
save bandwidth. We will gain something: the verification process is faster. (The
verifier receiving y' will perform some quick spot-checks).
A better approach is to use computationally sound (CS) proofs as in [20,22].
Having constructed the holographic proof y' as above, the server hashes it down
via a tree hashing scheme using a collision-resistant hash function h. Only the
root of the tree is sent to the originator. This is relatively short, so bandwidth
is saved. In addition, certain challenges are implicitly specified by applying an
ideal hash function to this root, and the server also provides answers to them.
The total communication from server to originator is still small compared to the
length of the original execution trace y, yet some confidence in the correctness
of y is transmitted!
The tree hashing is actually not impractical. What is prohibitive is con-
structing the holographic proof y' to which it is applied. This currently calls for
application of NP-completeness techniques, including the use of the construction
underlying Cook's theorem. What we might hope instead is to find a direct
holographic proof for the functions of interest, and then apply tree-hashing.
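The tree-hashing step can be sketched as follows; constructing the holographic proof y' itself is elided, a plain list of trace steps stands in for it, and the function names are ours:

```python
# Merkle tree hashing of a trace: only the root is sent up front; a
# challenged position is answered with the leaf plus its authentication
# path of sibling hashes, which the verifier checks against the root.
import hashlib

H = lambda b: hashlib.sha256(b).digest()

def merkle_root(leaves):
    level = [H(x) for x in leaves]
    while len(level) > 1:
        if len(level) % 2: level.append(level[-1])      # pad odd levels
        level = [H(level[i] + level[i+1]) for i in range(0, len(level), 2)]
    return level[0]

def auth_path(leaves, idx):
    """Sibling hashes needed to recompute the root from leaf `idx`."""
    level, path = [H(x) for x in leaves], []
    while len(level) > 1:
        if len(level) % 2: level.append(level[-1])
        path.append(level[idx ^ 1])
        level = [H(level[i] + level[i+1]) for i in range(0, len(level), 2)]
        idx //= 2
    return path

def check(root, leaf, idx, path):
    h = H(leaf)
    for sib in path:
        h = H(h + sib) if idx % 2 == 0 else H(sib + h)
        idx //= 2
    return h == root

trace = [f"step {i}".encode() for i in range(8)]   # stands in for y'
root = merkle_root(trace)                          # all that is committed up front
assert check(root, trace[5], 5, auth_path(trace, 5))
```

A spot-check thus costs one leaf plus log-many hashes, which is why the tree hashing itself is practical; the expensive part remains constructing y'.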
4 Trust Models
The issue of trust models is very important to agent-based computing. Agents do
not just need a trusted computing base (TCB) — trust may not be so binary in
nature. Instead, agents (or their deployers) may decide that it is okay to run in a
software-only environment if such an environment is hosted by a well-known and
trusted entity, but the use of physical protection to maintain the trustworthiness
of a trusted third-party provided execution environment is needed when the
environment is hosted by an entity with no reputation to protect and/or where
no legal remedies may be obtained.
* The Internet Engineering Task Force's Transport Level Security group has developed
a merged protocol based on SSL version 3 and features from PCT. These
protocols require a merchant-side public key infrastructure.
In Sanctuary, we envision that the trust decision will be made by the agent's
software itself. Thus, trust specification is simply an object in Java, and any
effectively computable function may be used. This is similar in spirit to the work of Blaze and Feigenbaum [10], except that by unifying the agent language and the trust specification language, the programmer's work is simplified.
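To illustrate the point, a trust specification can be an ordinary Java object whose decision method computes anything the programmer likes. The interface and environment fields below are hypothetical sketches mirroring the discussion in Section 4; the paper does not give Sanctuary's actual API.

```java
// Hypothetical sketch: a trust decision expressed as a plain Java object.
// None of these names are taken from the Sanctuary implementation.
public class TrustDemo {
    interface TrustPolicy {
        boolean okToRun(Env e);   // any effectively computable function
    }
    static class Env {
        final String hostId;
        final boolean hardwareProtected;  // secure-coprocessor environment?
        final boolean wellKnownHost;      // entity with a reputation to protect?
        Env(String id, boolean hw, boolean known) {
            hostId = id; hardwareProtected = hw; wellKnownHost = known;
        }
    }
    // Example policy from Section 4: a software-only environment is acceptable
    // only at a well-known host; otherwise physical protection is required.
    static final TrustPolicy POLICY = e -> e.wellKnownHost || e.hardwareProtected;
}
```

Because the policy and the agent share one language, the same object can travel with the agent and be evaluated before each migration.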
5 Mobile Java
6 Future Work
The Sanctuary project group are examining important security issues in mo-
bile agent computing. This paper has discussed some preliminary results and
directions.
The primary goal is to build a secure agent environment insofar as it is the-
oretically feasible, including issues such as providing clean abstractions to make
programming errors less likely as well as good cryptographic support. Currently,
we are building a trusted Java agent environment to run within a secure copro-
cessor and designing APIs that permit agents to exist both in a hardware-based
secure environment and in a software-only environment unchanged (but permit-
ting security property queries). Next, we will build the necessary software tools to
permit Java-based agents to be mobile. Our techniques, once implemented, will
enable these agents to run on unmodified Java interpreters; this design approach
permits greater acceptance of our work, since no complex installation process
will be required, and it will allow our system to track new Java releases more
easily. Additionally, we are examining alternative methods for providing security
for software agents through fault tolerance and cryptographic approaches (e.g., distributed function evaluation, additional uses of digital signature techniques, etc.).
Acknowledgements
The author wishes to thank Mihir Bellare for his invaluable help in preparing
this paper.
This research was funded in part by a National Science Foundation CAREER Award (CCR-9734243), a Faculty Development Award from National Semiconductor Corporation, and a gift from the Powell Foundation.
References
1. Martin Abadi and Joan Feigenbaum. Secure circuit evaluation. Journal of Cryptology, 2(1):1-12, 1990.
2. Martin Abadi, Joan Feigenbaum, and Joe Kilian. On hiding information from an oracle. Journal of Computer and System Sciences, 39(1):21-50, August 1989.
3. László Babai, Lance Fortnow, Leonid A. Levin, and Mario Szegedy. Checking computations in polylogarithmic time. In Proceedings of the Twenty-Third Annual ACM Symposium on Theory of Computing, pages 21-31, New Orleans, Louisiana, May 1991.
4. Mihir Bellare, Ran Canetti, and Hugo Krawczyk. Keying hash functions for message authentication. In Neal Koblitz, editor, Advances in Cryptology — Crypto '96, volume 1109 of Lecture Notes in Computer Science. Springer-Verlag, 1996.
5. Mihir Bellare, Roch Guerin, and Phillip Rogaway. XOR MACs: New methods
for message authentication using finite pseudo-random functions. In Advances in
Cryptology — Crypto '95, volume 963 of Lecture Notes in Computer Science, pages
15-28. Springer-Verlag, 1995.
6. Mihir Bellare and Bennet Yee. Forward-secure cryptography: How to protect against key exposure. Work in progress.
7. Mihir Bellare and Bennet Yee. Forward integrity for secure audit logs. Technical
report, Computer Science and Engineering Department, University of California
at San Diego, November 1997.
8. Josh Benaloh, Butler Lampson, Terence Spies, Dan Simon, and Bennet S. Yee.
The PCT protocol, October 1995.
9. Brian N. Bershad, Stefan Savage, Przemyslaw Pardyak, Emin Gün Sirer, Marc E. Fiuczynski, David Becker, Craig Chambers, and Susan Eggers. Extensibility, safety and performance in the SPIN operating system. In Proceedings of the Fifteenth Symposium on Operating Systems Principles, December 1995.
10. Matt Blaze, Joan Feigenbaum, and Jack Lacy. Decentralized trust management.
In Proceedings 1996 IEEE Symposium on Security and Privacy, May 1996.
11. Tim Dierks and Christopher Allen. The TLS protocol, version 1.0, November 1998. Internet Engineering Task Force Internet Draft; see http://www.ietf.cnri.reston.va.us/internet-drafts/draft-ietf-tls-protocol-06.txt.
12. Alan Freier, Philip Karlton, and Paul Kocher. The SSL protocol version 3, December 1995.
13. J. Steven Fritzinger and Marianne Mueller. Java security, 1996. Published as http://www.javasoft.com/security/whitepaper.ps.
14. Stuart Haber and W. Scott Stornetta. How to time-stamp a digital document.
Journal of Cryptology, 3(2), 1991.
15. Matthew Hohlfeld and Bennet S. Yee. How to migrate agents. Technical Report CS98-588, Computer Science and Engineering Department, University of California at San Diego, La Jolla, CA, August 1998.
16. Wilson C. Hsieh, Marc E. Fiuczynski, Charles Garrett, Stefan Savage, David
Becker, and Brian N. Bershad. Language support for extensible operating sys-
tems. In Proceedings of the Workshop on Compiler Support for System Software,
February 1996.
17. IBM Corporation. Common Cryptographic Architecture: Cryptographic Application
Programming Interface Reference, SC40-1675-1 edition.
18. IBM 4758 PCI cryptographic coprocessor. Press Release, August 1997.
http://www.ibm.com/Security/cryptocards/.
Volker Roth
1 Introduction
Mobile agents are bundles of program and state that move within a network to perform tasks on behalf of their owners. The benefits offered by mobile agents unfold in areas where it is advantageous to move the computation over the network to the source of the data instead of vice versa; for instance, if huge amounts of data must be processed, or the network is slow, expensive, or not permanently available. Among the manifold uses for mobile agents, electronic commerce applications are noted most frequently. With good reason: electronic commerce provides perfect grounds to illustrate the benefits of mobile agents as well as the threats that have so far kept us from using them in open networks.
One example benefit of a mobile agent is its ability to carry out certain
tedious or time-consuming tasks autonomously while its owner is offline.
The goal of, for instance, a shopping agent might be finding the best
offer for a given product description, thus optimising the benefit to its
owner (which is legitimate). However, shops might be tempted to opti-
mise for their own benefit, even if it means optimising at the expense of
the agent in unfair ways (which is not legitimate). This includes manipu-
lation of offers previously collected by the agent, as well as abusing agents
as mediators of attacks on competitors. Hence, neither party trusts the
other, and both share their distrust in the interconnecting network. As a
consequence, the security requirements for mobile agent systems are manifold as well as demanding. Quoting from [2]: "it is difficult to exaggerate
the value and importance of security in an itinerant agent environment.
While the availability of strong security features would not make itiner-
ant agents immediately appealing, the absence of security would certainly
make itinerant agents very unattractive."
In general, the problem of malicious hosts is considered particularly challenging. The threats posed by malicious hosts are inherent in the way mobile agent systems are built. Since agents are executed on their host, each instruction in the agent's code is observed by the controlling (virtual) machine, which also maintains the agent's state. This causes a number of security concerns, of which the most prominent are:
2 Co-operating Agents
The percentage of malicious hosts likely depends on the gain which can
be expected from successfully attacking mobile agents weighted against
the costs of mounting the attack as well as the risk of detection and the
consequences of being detected. Collaboration of multiple hosts on an agent's itinerary yields more power, but it also requires close coordination and increases the danger of leaks which might lead to disclosure of
the collaboration. Whether an agent will be attacked by a single mali-
cious host or a collaboration of hosts on its itinerary depends on a sea
of unpredictable parameters that are unique for each instantiation of an
agent.
Still, we would like to model partitions of hosts based on their willingness to collaborate with regard to attacking a fixed agent. For this reason, let H be the set of hosts interconnected by a network. For a given instantiation of an agent, let R be a relation defined as R ⊆ H × H with the interpretation (h_i, h_j) ∈ R ⇔ h_i and h_j collaborate in attacking the agent. Let H_a, H_b be non-empty subsets of H with (H_a × H_b) ∩ R = ∅. These two sets are denoted non-colluding. A special host is the first host (the origin) of an agent, since this needs to be a trusted host.
Two co-operating agents are defined to be agents a and b such that the itinerary of a includes only hosts in H_a and b's itinerary only includes hosts in H_b. Let h_a and h_b be the computing environments currently executing agents a and b respectively. Occasionally, we will say that h_a is "the host of" agent a, or "h_a is agent a's host".
Although agent a might be attacked by host h_a, by definition this host may not attack the co-operating agent b without breaking into host h_b. This can be exploited to design protocols for securing agents against attacks by single malicious hosts as well as hosts that collaborate with other hosts, as long as the collaboration does not span the itineraries of both agents.
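The non-collusion condition is directly checkable whenever the relation R is known. A small sketch (host names and the set representation are purely illustrative):

```java
import java.util.List;
import java.util.Set;

// Hosts Ha and Hb are non-colluding iff (Ha x Hb) intersected with R is
// empty. R is modelled as a set of ordered pairs; since collaboration is
// presumably symmetric, both orders of each pair are tested.
public class Collusion {
    public static boolean nonColluding(Set<String> ha, Set<String> hb,
                                       Set<List<String>> r) {
        for (String x : ha)
            for (String y : hb)
                if (r.contains(List.of(x, y)) || r.contains(List.of(y, x)))
                    return false;
        return true;
    }
}
```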
A number of strategies can be used as starting points for developing said
protocols:
Yee already pointed out that "if an agent is running on an honest server,
both these answers (for the peer identity and the local host's identity)
will be correct..." [15]. We assume that a host is honest unless it can successfully attack an agent on its own, or with the help of other hosts on this agent's itinerary. In other words, we assume that hosts do not randomly introduce lies.
A simple yet effective attack on a mobile agent is to not let the agent
migrate to the servers of competitors. This particularly affects mobile
agents with loose itineraries in comparison to agents whose itineraries are defined a priori, because deviations from a fixed itinerary are easier to spot and prove.
Let a and b be two co-operating agents, and let H_a, H_b ⊆ H be two non-colluding sets of hosts. Both agents shall return to their origin upon completion of their tasks. Each agent b records and verifies the route of its co-operating agent a as described below.
is not able to decide which one of the two hosts is the culprit. However, if h_{i+1} really received agent a from h', then h_{i+1} should be able to produce a copy of a which is signed by h', provided some additional agent protection mechanisms are implemented (see [5]).
If agent a is killed, one of two hosts might be responsible and the protocol cannot decide which one. In addition, some host h_{i+1} might take two agents a_1 and a_2, both received from the same host h_i, and switch the recording of the route of a_1 to agent a_2 and vice versa. Therefore, co-operating agents should also exchange and verify (unique) identity information that is bound to the agent's static part by the owner's signature (see [5]). In that case, attempts to send fake ids are detected on the first honest host. The protocol must be enhanced accordingly.
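The chain-checking part of such a route-recording scheme can be sketched as follows. Real signatures are replaced by a keyed hash per host, and the message format is invented, purely to keep the example self-contained; this is not Roth's wire protocol.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.util.Arrays;
import java.util.List;
import java.util.Map;

// Each host h_i forwarding agent a reports a signed statement
// "h_i -> h_{i+1}" to the co-operating agent b. Agent b verifies every
// signature and that consecutive hops form a chain, so a forged entry or
// a broken route is caught at the first inconsistency.
public class RouteLog {
    public static class Entry {
        final String from, to;
        final byte[] sig;
        Entry(String from, String to, byte[] sig) {
            this.from = from; this.to = to; this.sig = sig;
        }
    }
    // Toy "signature": hash of the host's secret key and the message.
    static byte[] sign(String hostKey, String msg) throws Exception {
        MessageDigest d = MessageDigest.getInstance("SHA-256");
        d.update(hostKey.getBytes(StandardCharsets.UTF_8));
        d.update(msg.getBytes(StandardCharsets.UTF_8));
        return d.digest();
    }
    public static Entry entry(Map<String, String> keys, String from, String to)
            throws Exception {
        return new Entry(from, to, sign(keys.get(from), from + "->" + to));
    }
    public static boolean verifyRoute(Map<String, String> keys, List<Entry> route)
            throws Exception {
        for (int i = 0; i < route.size(); i++) {
            Entry e = route.get(i);
            if (!Arrays.equals(e.sig, sign(keys.get(e.from), e.from + "->" + e.to)))
                return false;                       // forged or altered entry
            if (i > 0 && !route.get(i - 1).to.equals(e.from))
                return false;                       // route does not chain
        }
        return true;
    }
}
```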
6 Acknowledgements
I would like to thank the reviewers for their extensive comments on the
initial manuscript, which were most valuable for improving the quality of
this paper.
References
1. Blaze, M., Feigenbaum, J., Ioannidis, J., and Keromytis, A. The role of trust management in distributed systems security. In Secure Internet Programming [14].
2. Chess, D., Grosof, B., Harrison, C., Levine, D., Parris, C., and Tsudik, G. Itinerant agents for mobile computing. IEEE Personal Communications (October 1995), 34-49.
3. Karjoth, G., Asokan, N., and Gülcü, C. Protecting the computation results of free-roaming agents. In Mobile Agents (MA '98), vol. 1477 of Lecture Notes in Computer Science. Springer-Verlag, Berlin Heidelberg, September 1998, pp. 1-14.
4. Riordan, J., and Schneier, B. Environmental key generation towards clueless agents. In Mobile Agents and Security [13], pp. 15-24.
5. Roth, V., and Jalali, M. Access control and key management for mobile agents. Computers & Graphics 22, 3 (1998). Special issue Data Security in Image Communication and Networks.
6. Sander, T., and Tschudin, C. F. Protecting mobile agents against malicious hosts. In Mobile Agents and Security [13], pp. 44-60.
7. Schneier, B. Applied Cryptography, 1st ed. John Wiley & Sons, Inc., 1994, section 6.7, pp. 120-122. Digital Cash Protocol #4.
8. Berkovits, S., Guttman, J. D., and Swarup, V. Authentication for mobile agents. In Mobile Agents and Security [13], pp. 114-136.
Implementations
Access Control in Configurable Systems
Trent Jaeger
1 Introduction
A hope afforded by the mass connectivity of the Internet is that it would enable
users to engage in a variety of new applications regardless of their geographical
distance. Thus far, the most successful applications to be enabled by the Internet are client-server applications supported by the World-Wide Web. In these
applications, clients simply request information from servers. Servers may use
complex databases or dynamically determine the interface for processing the
request, but almost all the computing is done at the server.
While a number of useful applications can be developed using the client-
server paradigm, other computing approaches may be more appropriate for some
applications. This is particularly true for applications where the movement of
the computation is preferred over the movement of the data. In collaborative
applications, often the computation is smaller than the application state, so
each collaborator stores the application state and communicates their actions to
the group [16,26]. In this peer-to-peer approach, each collaborator may execute
an action performed by any member of the group on their own copy of the
shared state. Collaborative applications must be able to restrict the permissions
of such actions based on the state of the application. For example, only the files
currently being shared should be accessible to participants using a shared editor.
Second, research is underway to investigate how the modularity and exten-
sibility of operating systems can be improved. A number of systems have been
devised that can extend the operating system dynamically using kernel exten-
sions [4,12,42]. Also, the OSKit project is investigating how operating systems
derive enforcement mechanisms that can support flexible and efficient enforcement of access control policy. In Section 5, we describe the representations and mechanisms chosen for the Lava Security Architecture.
2 Security Requirements
We assume a computing model in which principals (e.g., users and services)
consume resources (e.g., CPU and memory) to perform operations (e.g., read
and write) on objects (e.g., files and channels). The ability to use resources to perform operations on objects is called the access rights of a principal. The
fundamental problem posed by the computing approaches described above is to
restrict the access rights of each dynamically downloaded module consistently
with its service or application requirements. Since these requirements may vary
with each use, the job of access control policy specification and management
is daunting. Also, authorization mechanisms must be able to enforce these requirements effectively and efficiently. In this section, we specify the access control
policy and authorization mechanism requirements for configurable systems.
Fig. 1. System Execution Model: Services and applications are composed of processes containing one or more dynamically-loaded modules
— Plant viruses and Trojan horses: Add binaries to the host's system and
modify the host's environment, so these malicious binaries are run by more
privileged users
The first four types of attacks are what access control infrastructure is typ-
ically designed to prevent. For example, the second type of attack can occur if
users do not adequately control discretionary access to their own executables
(e.g., a program under development). A module could identify that a file is a bi-
nary executable, and infect the program with a virus or replace it with a Trojan
horse.
The fifth type of attack is a denial-of-service of system resources. For example,
a module that writes to the disk until it is full would deny other modules from
writing to the disk. Operating systems typically do not prevent these attacks
from occurring. However, it has been shown that proper management of system resources can enable control of the amount of service resources that a module may consume [30]. Still, these mechanisms have not been demonstrated in practice, and it is non-trivial to determine reasonable resource limits.
In the sixth type of attack, the module leaves a trap that an unsuspecting
user may be lured into. For example, a Trojan horse program may be written
into a user's directory ostensibly used for application files, and the untrusted
module may modify the path environment (e.g., due to lax user administration
of their configuration files) resulting in the execution of the planted program
with the user's full rights.
The security requirements described above restrict the actions that a mod-
ule may perform on any system or application objects. Fundamental to these
requirements is the ability to create and name objects. In order for the system's
access control policy to be enforced on dynamically composed services and appli-
cations, it must be possible for this policy to be mapped to dynamically-created
object name spaces. These object names must be immutable to prevent renaming
after authorization (time-of-check-to-time-of-use or TOCTTOU attacks). Then,
basic access control on each operation on each object can prevent unauthorized
operations from being performed. In addition, control of resource consumption
may be desired to control some denial-of-service attacks. Next, it may be nec-
essary to control the delegation of rights by dynamically loaded modules. For
example, an application developer may not know that the delegation of a right
may lead to the creation of an unauthorized communication channel. Lastly, the policy controlling this delegation is determined outside the module, so it must be possible for the system to enforce its policy as it changes (e.g., due to a change in application state). Even though the system cannot prevent a principal
who possesses a right (in this case, the application) from using it to circumvent
security policy, it is often the case that the application wants to enforce security
policy properly, but it cannot be completely aware of that policy. The system
must be able to enforce its policy under these trust conditions.
Access control models must enable the specification of the desired security policies with a reasonable amount of effort. The key issue is that policies may
be dependent on dynamic factors, such as application state. For example, the
patient data available to a nursing application depends on the hospital, ward,
etc. in which the nurse principal works. In these cases, the access control model
must enable the association of application context and principals. Because policy
may be context-specific, it may be desirable for both system administrators and
other principals, such as application designers, to specify the rights available to a
module. Thus, the access control model must enable the specification of policies
by multiple principals, but these principals may not be completely trusted to
administer rights, so permission management must be restricted based on a
mandatory access control policy.
Once the security policy has been specified, the system must be able to enforce
it. The system must be able to derive legal permissions for its principals and
authorize their actions using these permissions.
A model for the derivation of a module's legal permissions is needed. As described above, this model must support the ability for multiple principals to delegate a limited set of rights. The set of rights that may be delegated may depend on the current state of the system or application. Therefore, there must be a mechanism for keeping the access control policy of a module consistent with the factors it depends on (e.g., the state of the application in which it is being run). A problem is that such management itself can become ad hoc. The model needs a means for expressing when a module's permissions should be expanded or retracted within the mandatory limits.
Once a module's permissions are determined, the authorization mechanism must be able to enforce these permissions. In general, authorization mechanisms must perform the following tasks [1]:
While the basic functions of the security mechanism remain the same as dy-
namically downloaded modules are introduced, the complexity of these tasks is
increased. First, modules may be loaded into the same protection domain (i.e.,
address space) to improve performance. However, operations within a protection
domain cannot be intercepted, so it is necessary for the security mechanisms to
determine whether such a load is permissible. Next, a dynamically downloaded
module may provide access to a resource that requires access control. For ex-
ample, the file system may be a dynamically downloaded module. However, this
module must be able to enforce system security constraints on the accesses to
its objects. Therefore, the system security mechanisms and the module need to
work together to enforce the system's access control policies properly.
Many systems associate principals directly with users or services. A user executes
a program, and the program assumes the rights of the user principal. Unfortu-
nately, when a user executes a downloaded module, the module cannot assume
the full rights of the user without opening the user to the attacks described
above.
In general, the permissions associated with a user executing a downloaded
module would be those of that user assuming a role of executing that module. For
example, Jaeger and Prakash show that the security policy in many collaborative
situations is the intersection of the permissions of the collaborators [22]. In these
cases, a user's permissions are those of the user acting on behalf of a collaborator.
This solution is specific to collaborative systems, but a more general approach
has been known for many years: role-based access control (RBAC) [41]. In a basic
RBAC model, called RBAC96 [38], the users are associated with the roles that
they may assume. Permissions are then associated with the roles rather than
the users. This enables users to execute programs with different (hopefully, least
privilege [37]) access rights.
Many RBAC models utilize inheritance to ease the specification of access
rights. For example, users may assume application-specific roles which have a
subset of their rights as shown in Figure 2. In general, a role hierarchy represents
a set-subset relation where the rights are inherited up the tree (which is the
opposite of object-oriented inheritance). Thus, any descendant role has a subset
of its ancestor's permissions.
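The up-the-tree inheritance rule is easy to state in code. A minimal sketch (the class and relation names are illustrative, not the RBAC96 formalism itself; the hierarchy is assumed acyclic):

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// RBAC96 relations: users are assigned roles, permissions are assigned to
// roles, and a role inherits the permissions of all its descendants
// ("up the tree"), so an ancestor holds a superset of each descendant.
public class Rbac {
    Map<String, Set<String>> userRoles = new HashMap<>();
    Map<String, Set<String>> rolePerms = new HashMap<>();
    Map<String, Set<String>> children = new HashMap<>();   // role -> descendant roles

    void assign(Map<String, Set<String>> m, String k, String v) {
        m.computeIfAbsent(k, x -> new HashSet<>()).add(v);
    }
    // All permissions of a role: its own plus, transitively, its descendants'.
    Set<String> perms(String role) {
        Set<String> out = new HashSet<>(rolePerms.getOrDefault(role, Set.of()));
        for (String c : children.getOrDefault(role, Set.of()))
            out.addAll(perms(c));
        return out;
    }
    // A user may perform perm only in a role actually assigned to them.
    boolean check(String user, String activeRole, String perm) {
        return userRoles.getOrDefault(user, Set.of()).contains(activeRole)
            && perms(activeRole).contains(perm);
    }
}
```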
Thus, roles can be created with the proper rights for each module, and users
can switch to the appropriate role when they want to execute a module. In
addition, RBAC models also prescribe the use of constraints to enforce more
complex security requirements, such as separation of duty. For example, the roles
of two modules can be evaluated to show that they share no access to common
objects. Also, Chinese Wall requirements can be enforced to prevent a user from
executing two mutually exclusive sets of privileges simultaneously [39]. Even
denial-of-service requirements, such as memory usage limits, can be expressed
as constraints on the use of permissions.
Fig. 2. Role Hierarchy: Note that permissions are inherited up the role hierarchy
3.2 Aggregations
As we saw, RBAC models enable users to be aggregated into logical roles and
permissions to be aggregated for use by those roles. However, access control mod-
els can support a number of other aggregations that may ease the specification
of security policy. We examine the aggregations of objects and operations. Then,
we propose an extended RBAC model in which these aggregations are included
and assess what this extended model enables.
A number of systems enable objects with the same rights to be aggregated [19,
24,7,2,49]. In the CORBA security model [19,25], objects are aggregated into
sets called domains. Permissions are specified per domain, and any principal
can be given rights to access objects in a domain. That is, the assumption is
that different principals have common object aggregations for specifying their
rights. Of course, this may not always be the case, as different principals may
Fig. 3. Extended RBAC model (elements: constraints, users, object groups, operation groups, objects, operations)
use their own objects, or may have common rights over different object aggregations. In this case, multiple domains must be created.
Next, the operations of a specific type can be aggregated into a set of opera-
tions named by a type with common access control requirements. For example,
certain operations may be designated as read operations and others as write
operations for particular objects. This enables system administrators to specify
the rights of principals to read and/or modify data given a definition of which
operations perform those types of operations. In our work [23], application de-
velopers aggregate operations into operation groups, and system administrators
define rights in terms of those groups. Similarly, CORBA enables the definition
of interfaces, and administrators define the required rights needed to access those
interfaces. For example, a principal needs read permission on a domain to access
their read interface.
Therefore, the RBAC model presented in the previous subsection can be ex-
tended as shown in Figure 3. In this RBAC model, objects for which common
rights may be expressed are aggregated into object groups. Operations which cor-
respond to common operation types can be aggregated into operation groups.
Then, the set of permissions is a relation between the sets of object groups and
operation groups. Further, object groups and operation groups can themselves
be aggregated, and these aggregations may be subject to constraints. For ex-
ample, an object group can be created that contains the intersection of two
object groups' members. Also, permissions can be constrained based on their
constituent object and operation groups. For example, this would enable the
specification of the intersection of two permission sets.
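The extended model's permission check reduces to membership tests against the two kinds of groups. A sketch (the group names and the data layout are illustrative):

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Extended RBAC: objects and operations are aggregated into groups, and a
// permission is a pair (object group, operation group). A role may perform
// op on obj iff it holds a permission whose groups contain them.
public class ExtRbac {
    Map<String, Set<String>> objectGroups = new HashMap<>();    // group -> objects
    Map<String, Set<String>> operationGroups = new HashMap<>(); // group -> operations
    Map<String, Set<String[]>> rolePerms = new HashMap<>();     // role -> (objGrp, opGrp)

    void grant(String role, String objGrp, String opGrp) {
        rolePerms.computeIfAbsent(role, r -> new HashSet<>())
                 .add(new String[]{objGrp, opGrp});
    }
    boolean allowed(String role, String obj, String op) {
        for (String[] p : rolePerms.getOrDefault(role, Set.of()))
            if (objectGroups.getOrDefault(p[0], Set.of()).contains(obj)
                && operationGroups.getOrDefault(p[1], Set.of()).contains(op))
                return true;
        return false;
    }
}
```

Fewer permission statements are needed because one (group, group) pair covers every member of both aggregations.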
An obvious benefit of this extended RBAC model is that the task of express-
ing permissions is eased because it is possible to express permissions using fewer
statements. An additional, less obvious benefit is that it enables a separation
of labor between users/developers and system administrators. Because system
administrators often do not know the semantics of application objects and operations, they cannot effectively express rights for them. Instead, application developers may create object groups and operation groups that map to semantically meaningful system administrator notions. While this may seem dangerous,
it is generally not unreasonable for modules to define the access rights of others
to their objects.
Unfortunately, this does not solve all our problems and creates some new
ones. Object and operation aggregations do not provide the flexibility needed in
some cases because they are static for all execution instances of an application.
For example, in a collaborative context, only the objects in use in that con-
text should be accessible to the collaborators. Therefore, an object aggregation
should depend on the application state. Also, a new problem is created because
application developers and users can assign rights to other principals. Every right
that is delegated to a principal should be within the system's security policy.
This does not preclude us from establishing that modules can enforce security
policy on their objects - we must trust them to provide proper access to the ob-
jects they serve. However, module writers cannot be trusted to know the system
security policy, so they cannot be allowed to delegate permissions arbitrarily.
Constraints, such as separation of duty, may be violated. In the following two
subsections, we address these two problems.
Fig. 4. Transform Limits: A principal's permissions are at most the union of the
permissions that may be delegated by principals 1,2, and 3.
3.4 Administration
RBAC models have also been designed that support the administration of se-
curity policies. In the ARBAC 97 model [40], an administrative role hierar-
chy defines the administrative roles and their rights to modify the information
in the basic role hierarchy: user-role assignment, permission-role assignment,
and role-role assignment (i.e., construction of the role hierarchy itself). Based
on the extensions described above, ARBAC 97 could likewise be extended to
enforce object-object group assignment, operation-operation group assignment,
and object-operation assignment (i.e., creation of permissions).
3.5 Constraints
The use of constraints in RBAC is the least-developed notion in RBAC models.
Currently, research has identified two unsatisfactory notions of constraints. In some systems, such as ARBAC 97 or Napoleon, a general notion of constraints is included in the model, but without a language for specifying such constraints. In other systems, specific, limited constraints are supported; however, these constraint languages are not general purpose.
A security constraint language must enable enforcement of the variety of
security constraints. For example, it must be possible to specify restrictions
on the rights a principal may attain, that principals cannot modify executables,
separation of duty between principals, Chinese Wall constraints, denial-of-service
constraints, etc. However, some constraints are distinctly different, so entirely
different constraint models may be appropriate. For example, enforcement of
the Bell-LaPadula security policy is much different than the basic access control
policy outlined here. While it has been shown that general RBAC models can
enforce Bell-LaPadula [3], it is not clear whether a general constraint language
should be exposed to the administrators who write the policies.
In our work, we take a two-pronged approach. First, we enable the speci-
fication of transform limits. A transform limit specifies the set of permissions that a delegatee can obtain from a specific delegator. The set of transform limits determines the maximal set of permissions that a principal can obtain. Second, the definition of such transform limits depends on higher-level security constraints, such as Chinese Wall [8]. A Chinese Wall policy is a dynamic separation of duty
policy in which only one of several disjoint sets of rights may be used depending
on the actions of the principal. Constraints are written that determine the trans-
form limits that may be active. The language for expressing such constraints is
still under development.
Bertino et al. propose a language for defining constraints on the assignment of users to roles and the creation of inheritance role hierarchies [5]. Their approach is designed to enforce constraints on the execution of workflow tasks. For
example, a user may be restricted from executing a second task if he executes a
previous task in a particular role. Therefore, constraints are specified in terms
of tasks. Constraints specific to operations within a task cannot be expressed,
however.
Our opinion is that a single general constraint language should be devised.
This language should be general enough to express any constraint (on users, ob-
jects, operations, and their administration). However, such a language will likely
be too abstract and complex for use by system administrators, so application
specification languages must be built on this base constraint language for the
individual applications.
[Figure 5: A security manager in a module process applies transforms, bounded by the transform limits, to the permissions of each principal as operations are performed.]
the thread many rights that are not necessary given the application in which the
class is used, the purpose of the class in that application, and the application's
current state.
Instead, we advocate an approach where a principal's permissions are always
consistent with the current state of the application or service. Using our access
control model, these delegations are restricted by the transform limits of the
principal. The problem is to automate the process of delegation. For this purpose,
we define the concept of a transform which associates an operation with a change
in permissions of one or more principals [23] (see Figure 5). Thus, changes in
application state can be correlated to changes in permissions. These permissions
changes are authorized using the transform limits of the delegatees.
Once a principal obtains a permission, a capability can be obtained for it. For
example, possession of a permission to read and write a file enables the principal
to obtain a capability with those operations activated. When a transform revokes
a principal's permission, the system may need to revoke some of its capabilities.
However, multiple principals may grant the same permission, so any affected
capabilities are only invalidated. They are authorized again upon the next use.
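The transform mechanism just described can be sketched as follows (hypothetical interfaces, not the Lava API): each transform grants or revokes permissions within a principal's transform limit, and revocation only invalidates capabilities, which are re-authorized, or rejected, on their next use.

```python
# Hypothetical sketch of transforms: an operation triggers permission
# changes bounded by each delegatee's transform limit. Revoking a
# permission invalidates capabilities rather than deleting them; a
# capability is re-authorized on next use if some permission remains.

class Monitor:
    def __init__(self, limits):
        self.limits = limits                    # principal -> transform limit
        self.perms = {p: set() for p in limits}
        self.valid_caps = set()                 # (principal, permission) pairs

    def apply_transform(self, principal, grants=(), revokes=()):
        for perm in grants:
            if perm in self.limits[principal]:  # bounded by transform limit
                self.perms[principal].add(perm)
        for perm in revokes:
            self.perms[principal].discard(perm)
            self.valid_caps.discard((principal, perm))  # invalidate only

    def use_capability(self, principal, perm):
        if (principal, perm) not in self.valid_caps:
            if perm not in self.perms[principal]:       # re-authorization fails
                return False
            self.valid_caps.add((principal, perm))      # re-authorized on use
        return True

m = Monitor({"editor": {"read", "write"}})
m.apply_transform("editor", grants={"write", "delete"})  # "delete" exceeds limit
assert m.perms["editor"] == {"write"}
m.apply_transform("editor", revokes={"write"})
assert not m.use_capability("editor", "write")
```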
kernel uses its internal representation of the system name space and its internal
representation of the system security policy to authorize system calls.
There are three problems with this approach for systems configured from
modules dynamically: (1) the name spaces of objects that need to be controlled
may be outside the control of the kernel; (2) the kernel does not aid the applica-
tions in enforcing their security requirements; and (3) the access control policy
flexibility of our model increases the kernel's complexity. First, modules manage
their own name spaces, but these name spaces must be predictable, so system
and application security policies can be enforced. Second, downloaded modules
may want the system's trusted computing base to help them enforce their access con-
trol requirements. For example, an application may use another module, but this
module should only have a subset of the application's rights. Lastly, the addition
of more flexibility to security policy means that the authorization mechanisms
become more complex. In a configurable system, inter-process communication
(IPC) must be very fast to support a reasonably fine granularity of processes,
but the addition of a complex authorization mechanism into each IPC would
add a mandatory cost to all IPCs (even those with few requirements to check).
Micro-kernel systems (e.g., Mach [36] and Spring [13]) avoid the name space
problem by only controlling communication between processes. For a client pro-
cess to invoke an operation on a server process, the client must have a capability
to send a message to a port to which that server has a receive capability. The
semantics of the operation are determined by the server and authorized by the
server in its own way. Also, capabilities can be passed freely in messages, so
no control on delegation is enforced. While this removes the complexity of the
name space from the kernel, it prevents the kernel from being able to enforce
the system's current security policy and requires each server to define its own
authorization mechanism which may result in errors.
Reference monitors enable control of delegation. For example, the Distributed
Trusted Operating System (DTOS) extends the Mach kernel by a mechanism
that enables custom security policies to be enforced by security servers [33].
DTOS adds a capability cache to the kernel, so servers can call the kernel to
authorize operations. If a capability is not present, a security fault is taken which
results in a call to the appropriate security server. Security servers interpret the
server's object name spaces. The administrators of the services that use this se-
curity server must know enough about the object name spaces of these servers to
enforce the policy. Upon a successful authorization, the security server then up-
dates the capability cache. Also, security servers can invalidate the entries in the
capability cache which enables revocation. DTOS provides a flexible authoriza-
tion mechanism, but it makes the kernel more complex (which has been found
to degrade IPC performance [29]) and requires extra communication overhead
to authorize the operation (1 round-trip IPC).
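The DTOS-style interaction just described can be sketched roughly as follows (hypothetical names and a toy policy, not the DTOS interfaces): a cache miss in the kernel triggers a security fault, the security server decides and fills the cache, and revocation is cache invalidation.

```python
# Illustrative DTOS-style flow: the kernel checks a capability cache;
# a miss causes a security fault resolved by the security server, which
# can also invalidate cache entries to effect revocation.

class SecurityServer:
    def decide(self, subject, obj, op):
        # Toy policy: everything is allowed except one triple.
        return (subject, obj, op) != ("applet", "passwd", "read")

class KernelCache:
    def __init__(self, server):
        self.server = server
        self.cache = {}

    def authorize(self, subject, obj, op):
        key = (subject, obj, op)
        if key not in self.cache:                 # security fault: 1 round-trip
            self.cache[key] = self.server.decide(*key)
        return self.cache[key]

    def revoke(self, subject, obj, op):
        self.cache.pop((subject, obj, op), None)  # invalidation = revocation

kc = KernelCache(SecurityServer())
assert kc.authorize("editor", "notes", "read")       # fault, then cached
assert not kc.authorize("applet", "passwd", "read")  # denied by the server
```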
Our goal is to combine the kernel cache and security servers into one entity
outside the kernel. The Clans & Chiefs mechanism is not explicitly an authoriza-
tion mechanism, but it can be applied to authorization [28]. In this mechanism,
a chief is a special process in a clan (i.e., set of processes). Any IPC between
two processes in the same clan are forwarded directly to the destination by the
kernel. However, any IPC either to a process outside the clan or from a process
outside the clan is automatically redirected by the kernel to the chief. In a secu-
rity scenario, the chief can authorize any communication between processes in
its clan and the other processes in the system. In an early version of the Lava
Security Architecture [21], we use a chief for every dynamically loaded module,
so we could control its accesses to the trusted servers and other dynamically
downloaded modules. The chief stores the access control policy of the process
that it controls and authorizes all of this process's operations. Of course, if an
IPC is between two different clans, then three system IPCs (client-chief, chief-
chief, chief-server) are required for the communication to be complete. While we
showed that the base cost could be as low as 4 μs and that we measured 9 μs,
the extra IPC (chief-chief) performed no useful operations.
The problems with the Clans & Chiefs mechanism are: (1) that every process
in a clan must be able to freely communicate with every other process in the
clan and (2) at least two chiefs must be invoked for every inter-clan IPC. In
order for two processes to belong to the same clan they must be able to freely
communicate, freely distribute capabilities, and avoid revocation of the other's
capabilities for their entire execution. To interpose a chief requires that the pro-
cesses be deleted and re-created in the desired clans (i.e., clan relationships are
static). Therefore, most of our clans have degenerated to a single clan member.
Also, it has been shown that a single reference monitor is capable of enforcing
very complex security requirements. The need to go through multiple chiefs, by
default, is too expensive.
Therefore, we evolve the Clans & Chiefs mechanism to better satisfy these
goals. In general, the notion of the IPC redirection is powerful: an IPC can either
be sent to the destination process or redirected to another process. Therefore, we
endow the kernel with the ability to maintain a mapping between destinations
and redirections for each process [20]. For example, consider Figure 6. When the
source sends an IPC to a destination, the kernel is invoked. The kernel examines
the redirection cache to determine if there exists an entry for the destination. For
example, a particular reference monitor may be assigned to this communication
channel. In this case, a reference monitor receives the redirected IPC and can
authorize it before sending it to the destination.
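The redirection mechanism can be modeled as a per-channel table consulted on every send (an illustrative sketch only; the Lava nucleus implements this inside the kernel's IPC path):

```python
# Illustrative model of IPC redirection: the kernel consults its
# redirection cache; if an entry maps this (source, destination) channel
# to a reference monitor, the IPC is diverted there, otherwise it is
# delivered directly.

class Kernel:
    def __init__(self):
        self.redirection = {}   # (source, dest) -> reference monitor

    def set_monitor(self, source, dest, monitor):
        self.redirection[(source, dest)] = monitor

    def clear_monitor(self, source, dest):
        # Destination is now trusted to enforce policy on this source.
        self.redirection.pop((source, dest), None)

    def send(self, source, dest, msg):
        target = self.redirection.get((source, dest), dest)
        return target, msg      # deliver to monitor or directly to dest

k = Kernel()
k.set_monitor("client", "server", "rm1")
assert k.send("client", "server", "op")[0] == "rm1"    # mediated
k.clear_monitor("client", "server")
assert k.send("client", "server", "op")[0] == "server" # direct
```

Note that re-installing the monitor entry suffices to re-interpose mediation when the access control policy changes, without recreating any processes, in contrast to the static clan relationships above.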
An effective reference monitor must be able to mediate all communication,
but how the mediation is done can be controlled flexibly using this mechanism.
In general, a reference monitor between a particular source and destination ver-
ifies that the source has the ability to invoke the requested operation on the
destination (i.e., within the source's current permissions). Thus, the system's
security policy can be enforced on each IPC. If the destination is trusted to
enforce the system's security policy on a particular source, then the reference
monitor may be removed from that communication channel (i.e., the redirec-
tion is changed from the reference monitor to the server). However, if the access
control policy should change, then the IPC redirection mechanism enables the
[Figure 6: IPC redirection in the Lava nucleus. On each send, the kernel consults its redirection cache to decide whether the IPC is delivered directly to the destination or redirected, e.g., to a reference monitor.]
and an authorization library (see Figure 7). The Lava nucleus provides operat-
ing system primitives upon which all applications and services are constructed:
multi-threaded tasks, address spaces, and inter-process communication (IPC).
The nucleus uses the system's hardware protection to separate individual tasks
and IPC redirection to control communication. The SAI derives the access con-
trol policy for each module and loads the modules into tasks in such a way that
this policy can be enforced. The SAI assigns reference monitors to enforce a
task's access control policy. A reference monitor can mediate all IPCs on the
communication links it has been assigned, so it can restrict communication be-
tween any two tasks, restrict delegation of permissions, restrict the operations its
tasks can perform, and revoke permissions if the access control policy changes.
The transform library supports the execution of transforms upon module oper-
ations, so that the access control policy can be maintained consistently with the
application's state. The authorization library is provided to servers, so they may
enforce system security policy on their objects without the need to build ad hoc
security infrastructures.
The access control policy data must be distributed carefully between the
servers and reference monitors to enable flexible and secure authorization as
shown in Figure 8. Each reference monitor maintains its task's permissions.
Permissions are stored per server in server permission tables. The reasons for storing
permissions per server are: (1) servers must be able to authorize the creation
of capabilities to their objects, but we do not want them to have access to
permissions for another server and (2) this reduces the memory fragmentation
Fig. 8. Access Control Data: (1) server permissions table which holds the permis-
sions of each task organized per server, so they may be mapped read-only to the server
for authorizing capability creation; (2) server capabilities table which contains the ca-
pabilities the server has created for each task; and (3) valid references table which tells
which server capabilities have been authorized for use by which tasks.
that would occur if the permissions were stored per task and server. However,
the result is that all reference monitors have access to the same server permission
table pages, so some concurrency control on update is required. Note that the
servers have read-only copies of their permissions, so the reference monitors have
control over what the system policy is.
Each server and reference monitor may maintain their own server capabilities
table in which capabilities to the server's objects are stored. Obviously, the refer-
ence monitor's copy of a server's capabilities is redundant, but may be necessary
because a server may not be trusted to maintain its own capability information
securely (i.e., an authorization mechanism must ensure that the security data is
tamperproof). For trusted servers, a single copy of its capabilities may be stored
and used by reference monitors as well (read-only). When a capability is created,
the servers return capability references to the holder of the capability. This refer-
ence refers to the index of the capability in the server's server capabilities table.
The reference monitors may obtain read-only access to the server capabilities
table which they use to authorize delegations and revoke capabilities to objects
when permissions are removed.
[Figure: reference monitors (RM) mediate the numbered IPC steps between client tasks and servers.]
server is trusted to maintain its capabilities, then no reference faults will occur
because the reference monitor will have access to these capabilities.
If the operation is authorized by the reference monitor any capability ref-
erences in the request are also passed to the server (step 2). This delegation is
authorized by the server's reference monitor when the server tries to use them.
Since the capability reference bitmap will indicate that the capability is invalid,
the server must have a permission that enables it to use that capability. The
server authorization library ensures that all capabilities for the same server ob-
ject are stored at the same reference, so it is not possible to get a false positive.
The server uses the authorization library to authorize the operation requested
using the system's security policy as well as its own. First, it checks its server
capabilities table for the principal and capability. As described above, a capabil-
ity may have been delegated to the requesting task by another task. In this case,
the server authorizes the creation of a capability using the server permissions for
the requesting task. The server may enforce additional security requirements on
the client task which it specifies to the authorization library.
If the server generates a capability as a result of the operation, it uses the
server authorization library to authorize the creation (using the server permis-
sions) and to add it to the server capabilities table for the task. The server
returns a capability reference which is an index to the capability in the server
capabilities table.
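The server-side authorization steps above might be sketched as follows (hypothetical names; the real library also manages the shared, read-only permission and capability tables):

```python
# Hypothetical sketch of the server authorization library: first check
# the server capabilities table for the (task, reference) pair; on a
# miss (e.g., a delegated reference), fall back to authorizing the
# creation of a capability from the task's server permissions.

class AuthLibrary:
    def __init__(self, permissions):
        self.permissions = permissions  # task -> permissions (read-only copy)
        self.cap_table = {}             # (task, capability ref) -> operation

    def authorize(self, task, ref, operation):
        cached = self.cap_table.get((task, ref))
        if cached == operation:
            return True                 # capability already held by this task
        # Delegated or new reference: authorize creation from permissions.
        if operation in self.permissions.get(task, set()):
            self.cap_table[(task, ref)] = operation
            return True
        return False

lib = AuthLibrary({"task1": {"read"}})
assert lib.authorize("task1", 7, "read")       # created from permissions
assert lib.authorize("task1", 7, "read")       # now found in the table
assert not lib.authorize("task1", 7, "write")  # outside task1's permissions
```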
6 Conclusions
A system that composes services and applications from modules that are dynam-
ically downloaded must address the possibility that the modules used may not be
fully trusted. However, these may be binary modules, so a wide variety of system
compromises are possible. The fundamental means for controlling the operations
that a module can perform is access control. Access control consists of two main
functions: (1) access control policy specification and (2) access control policy
enforcement. In this paper, we survey the representations for flexible access con-
trol policy specification and mechanisms for enforcing these policies to identify
their useful features. We then describe how the Lava Security Architecture takes
advantage of these useful features.
The main concern in the composition of applications and services from mod-
ules is that the permissions associated with a module are consistent with the
module's purpose (i.e., least privilege). Current systems associate permissions
either based on the executor of the module (e.g., in operating systems) or the
provider and name of the module (e.g., in language-based systems). However,
this information does not describe the permissions that are consistent with the
application or service's state and the role of the module in the application or
service. We develop a role-based access control model that enables permissions
¹ Some denial-of-service attacks may also be thwarted by reference monitors, so the
removal of a reference monitor also implies that the server is not susceptible to these
attacks.
Acknowledgements
I would like to acknowledge the support of the Lava project team, particularly,
Kevin Elphinstone, Jochen Liedtke, Seva Panteleenko, and Yoon Park. Other
helpful direction has been provided by Ed Felten, Peter Honeyman, Li Gong,
Atul Prakash, Avi Rubin, Jonathan Shapiro, Leendert van Doorn, and Dan
Wallach.
References
5. E. Bertino, E. Ferrari, and V. Atluri. A flexible model for the specification and
enforcement of role-based authorizations in workflow management systems. In
Proceedings of the Second ACM Role-Based Access Control Workshop, November
1997.
6. M. Bishop and M. Dilger. Checking for race conditions in file accesses. Computing
Systems, 9(2):131-152, 1996.
7. W. E. Boebert and R. Y. Kain. A practical alternative to hierarchical integrity
policies. In Proceedings of the 8th National Computer Security Conference, pages
18-27, 1985.
8. D. F. C. Brewer and M. J. Nash. The Chinese Wall security policy. In Proceedings
of IEEE Symposium on Security and Privacy, pages 206-214, 1989.
9. J. S. Chase, H. M. Levy, M. J. Feeley, and E. D. Lazowska. Sharing and protec-
tion in a single-address-space operating system. ACM Transactions on Computer
Systems, 12(4):271-307, November 1994.
10. J. B. Dennis and E. C. Van Horn. Programming semantics for multiprogrammed
computations. Communications of the ACM, 9(3):143-155, March 1966.
11. S. Dorward, R. Pike, and P. Winterbottom. Inferno: la commedia interattiva, 1996.
Available from inferno.bell-labs.com.
12. D. Engler, F. Kaashoek, and J. O'Toole. Exokernel: An operating system ar-
chitecture for application level resource management. In Proceedings of the 15th
Symposium on Operating Systems Principles, pages 251-266, December 1995.
13. J. G. Mitchell et al. An overview of the Spring system. In Proceedings of Compcon,
February 1994.
14. B. Ford, G. Back, G. Benson, J. Lepreau, A. Lin, and O. Shivers. The Flux OSKit:
A substrate for kernel and language research. In Proceedings of the 16th Symposium
on Operating Systems Principles, pages 38-51, 1997.
15. L. Giuri and P. Iglio. Role templates for content-based access control. In Proceed-
ings of the Second ACM Role-Based Access Control Workshop, November 1997.
16. Y. Goldberg, M. Safran, and E. Shapiro. Active Mail - a framework for imple-
menting groupware. In CSCW 92 Proceedings, pages 75-83, 1992.
17. L. Gong. Java security: present and near future. IEEE Micro, 17(3):14-19, 1997.
18. L. Gong, M. Mueller, H. Prafullchandra, and R. Schemers. An overview of the
new security architecture in the Java Development Kit 1.2. In Proceedings of
the USENIX Symposium on Internet Technologies and Systems, pages 103-112,
December 1997.
19. Object Management Group. Security service specification. In CORBAservices:
Common Object Services Specification, chapter 15. November 1997. Available from
http://www.omg.org.
20. T. Jaeger, K. Elphinstone, J. Liedtke, V. Panteleenko, and Y. Park. Flexible access
control using IPC redirection. In Proceedings of the 7th Workshop on Hot Topics
in Operating Systems, 1999. To appear.
21. T. Jaeger, J. Liedtke, and N. Islam. Operating system protection for fine-grained
programs. In Proceedings of the 7th USENIX Security Symposium, pages 143-156,
January 1998.
22. T. Jaeger and A. Prakash. Support for the file system security requirements of
computational e-mail systems. In Proceedings of the 2nd ACM Conference on
Computer and Communications Security, pages 1-9, 1994.
23. T. Jaeger, A. Prakash, J. Liedtke, and N. Islam. Flexible control of downloaded
executable content. ACM Transactions on Information System Security, May 1999.
To appear.
24. T. Jaeger, A. Rubin, and A. Prakash. Building systems that flexibly control down-
loaded executable content. In Proceedings of the 6th USENIX Security Symposium,
pages 131-148, July 1996.
25. G. Karjoth. Authorization in CORBA security. In Proceedings of ESORICS '98,
1998.
26. M. Knister and A. Prakash. Issues in the design of a toolkit for supporting multiple
group editors. Computing Systems, 6(2):135-166, 1993.
27. B. Lampson. Protection. ACM Operating Systems Review, 8(1):18-24, January
1974.
28. J. Liedtke. Clans & chiefs. In Architektur von Rechensystemen. Springer-Verlag,
March 1992. In English.
29. J. Liedtke. Improving IPC by kernel design. In Proceedings of the 14th Symposium
on Operating Systems Principles, pages 175-187, 1993.
30. J. Liedtke, N. Islam, and T. Jaeger. Preventing denial-of-service attacks on a
μ-kernel for WebOSes. In Proceedings of the Sixth Workshop on Hot Topics in
Operating Systems, pages 73-79, May 1997.
31. E. C. Lupu and M. Sloman. Reconciling role-based management and role-based
access control. In Proceedings of the Second ACM Role-Based Access Control Work-
shop, November 1997.
32. S. D. Majewski. Distributed programming: Agentware/componentware/distributed
objects. Available at http://minsky.med.virginia.edu/sdm7g/Projects/Python/
SafePython.html.
33. S. E. Minear. Providing policy control over object operations in a Mach-based
system. In Proceedings of the 5th USENIX Security Symposium, 1995.
34. N. H. Minsky and V. Ungureanu. Unified support for heterogeneous security policies
in distributed systems. In Proceedings of the 7th USENIX Security Symposium,
pages 131-142, January 1998.
35. J. K. Ousterhout, J. Y. Levy, and B. B. Welch. The Safe-Tcl security model. In
Proceedings of the 23rd USENIX Annual Technical Conference, 1998.
36. R. Rashid, A. Tevanian Jr., M. Young, D. Golub, D. Baron, D. Black, W. J.
Bolosky, and J. Chew. Machine-independent virtual memory management for
paged uniprocessor and multiprocessor architectures. IEEE Transactions on Com-
puters, 37(8):896-908, August 1988.
37. J. H. Saltzer and M. D. Schroeder. The protection of information in computer
systems. Proceedings of the IEEE, 63(9):1278-1308, September 1975.
38. R. Sandhu. Rationale for the RBAC96 family of access control models. In Pro-
ceedings of the 1st Workshop on Role-Based Access Control, 1995.
39. R. Sandhu. Role activation hierarchies. In Proceedings of the Third Workshop on
Role-Based Access Control, 1998.
40. R. S. Sandhu, V. Bhamidipati, E. Coyne, S. Ganta, and C. Youman. The ARBAC97
model for role-based administration of roles: preliminary description and outline.
In Proceedings of the Second Workshop on Role-Based Access Control, pages 41-50,
1997.
41. R. S. Sandhu, E. J. Coyne, H. L. Feinstein, and C. E. Youman. Role-based access
control models. IEEE Computer, 29(2):38-47, February 1996.
42. M. I. Seltzer, Y. Endo, C. Small, and K. A. Smith. Dealing with disaster: Surviving
misbehaved kernel extensions. In Proceedings of the 2nd Conference on Operating
Systems Design and Implementation, pages 213-227, 1996.
43. D. Thomsen, D. O'Brien, and J. Bogle. Role based access control framework for
network enterprises. In Proceedings of the Fourteenth Computer Security Applica-
tions Conference, 1998.
1 Introduction
Extensible systems, such as Java [18,25] or SPIN [6], promise more power and
flexibility, and thus enable new applications such as smart clients [48] or active
networks [44]. Extensible systems are best characterized by their support for
dynamically composing units of code, called extensions in this paper. In these
systems, extensions can be added to a running system in almost arbitrary fashion,
and they interact with each other through low-latency, but type-safe interfaces.
Extensions and the core system services are typically co-located within the
same address space, and form a tightly integrated system. Consequently, exten-
sible systems differ fundamentally from conventional systems, such as Unix [2£],
which rely on processes executing under the control of a privileged kernel.
As a result of this structuring, system security becomes an important chal-
lenge, and access control becomes a fundamental requirement for the success of
extensible systems. As system security is customarily expressed through protec-
tion domains [22,38], an access control mechanism must:
- structure the system into protection domains (which are an orthogonal con-
cept to conventional address spaces),
- enforce these domains through access control checks,
- support auditing of system operations.
Furthermore, an access control mechanism must address the fact that extensions
often originate from other networked computers and are untrusted, yet execute
as an integral part of an extensible system and interact closely with other ex-
tensions.
In this paper, we present an access control mechanism for extensible systems
that meets the above requisites. We build on the idea of separating policy and
enforcement first explored by the DTOS effort [30,34,40,39], and introduce a
mechanism that not only separates policy from enforcement, but also access con-
trol from the actual functionality of the system. The access control mechanism
is based on a simple, yet powerful model for the interaction between its policy-
neutral enforcement manager and a given security policy, and is transparent to
extensions and the core system services in the absence of security violations.
It works by inspecting extensions for their types and operations to determine
which abstractions require protection, and by redirecting procedure or method
invocations to inject access control operations into the system.
The access control mechanism provides three types of access control opera-
tions. The operations are (1) explicit protection domain transfers to delineate the
protection domains of an extensible system, (2) access checks to control which
code can be executed and which arguments can be passed between protection
domains, and (3) auditing to provide a trace of system operations. The access
control mechanism works at the granularity of individual procedures (or, ob-
ject methods), and provides precise control over extensions and the core system
services alike.
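The three operation types can be illustrated schematically as a guard injected by redirecting an invocation (an illustrative sketch with assumed names, not the actual SPIN interfaces):

```python
# Illustrative interposition wrapper: a procedure call is redirected
# through a guard that (1) transfers into the callee's protection
# domain, (2) checks access to the code and its arguments, and
# (3) audits the operation.

audit_log = []

def secure(proc, domain, policy):
    def wrapper(caller_domain, *args):
        if not policy(domain, proc.__name__, args):               # (2) access check
            raise PermissionError(proc.__name__)
        audit_log.append((caller_domain, domain, proc.__name__))  # (3) audit
        return proc(*args)  # (1) executes within the callee's domain
    return wrapper

def read_file(name):
    return f"contents of {name}"

# Toy policy: deny access to anything under /etc.
policy = lambda dom, op, args: not args[0].startswith("/etc")
guarded = secure(read_file, "fs-domain", policy)

assert guarded("applet", "data.txt") == "contents of data.txt"
assert audit_log == [("applet", "fs-domain", "read_file")]
```

As in the text, the wrapped procedure itself is untouched; the guard is transparent until a violation occurs, at which point the caller sees an ordinary fault.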
Access control and its enforcement is but one aspect of the overall security
of an extensible system. Other important issues, such as the specification of
security policies, or the expression and transfer of credentials for extensions
are only touched upon or not discussed at all in this paper. Furthermore, we
assume the existence of some means, such as digital signatures, for authenticating
both extensions and users. These issues are orthogonal to access control, and we
believe that a simple, yet powerful access control mechanism, as presented in
this paper, can serve as a solid foundation for future work on other aspects of
security in extensible systems.
The remainder of this paper is structured as follows: Section 2 elaborates
on the goals of our access control mechanism, and Sect. 3 describes its design.
Section 4 presents the implementation of our access control mechanism within
the SPIN extensible operating system. Section 5 reflects on our experiences with
designing and implementing our access control mechanism, and Sect. 6 presents
a detailed performance analysis of the implementation. Section 7 reviews related
work, and Sect. 8 outlines future directions for our research into the security of
extensible systems. Finally, Sect. 9 concludes this paper.
2 Goals
An access control mechanism for an extensible system must impose additional
structure onto the system. But, at the same time, it should only impose as much
structure as strictly necessary to preserve the advantages of an extensible system.
Based on this realization, we identify four goals which inform the design of our
system.
Separate access control and functionality. The access control mechanism
should separate policy and enforcement from the actual code of the system and
extensions. This separation of access control and functionality supports chang-
ing security policies without requiring access to source code. This is especially
important for large computer networks, such as the Internet, where the same
extension may execute on different systems with different security requirements,
and where source code typically is not available. This goal does not prevent the
programmer who writes an extension from defining (part of) the security policy
for that extension. However, it calls for a separate specification of such policy,
comparable to an interface specification which offers a distinct and concise de-
scription of the abstractions found in a unit of code. This policy specification
may then be loaded into an extensible system as the extension is loaded.
Separate policy and enforcement. The mechanism should separate the security
policy from its actual enforcement. This separation of policy and enforcement
allows for changing security policies without requiring intrinsic changes to the
core services of the extensible system itself. Rather, the security policy is pro-
vided by a (trusted) extension, and, as a result, the access control mechanism
leverages the advantages of an extensible system and becomes extensible itself.
Use a simple, yet expressive model. The mechanism should rely on a simple
model of protection that covers a wide range of possible security policies, includ-
ing policies that change over time or depend on the history of the system. This
goal ensures that the access control mechanism can strictly enforce a wide range
of security policies, and that the security policy has control over all relevant
aspects of access control. At the same time, it favors simplicity over complex
interactions between security policy and its enforcement.
Enforce transparently. The mechanism should be transparent to extensions
and the core system services, in that they should not need to interact with it
as long as no violations of the security policy occur. This goal ensures that
the mechanism actually provides a clean separation of security policy, enforce-
ment, and functionality. Furthermore, it provides support for legacy code (to
a degree), and enables aggressive, policy-specific optimizations that reduce the
performance overhead of access control. At the same time, it guarantees that
extensions are notified of security faults, and can implement their own failure
model. Consequently, this goal attempts to reduce the access control interface,
as seen by extensions, to handling a program fault such as division by zero or
dereferencing a NIL reference.
The above four goals, taken together, call for a design that isolates functional-
ity, security policy, and enforcement in an extensible system, and that provides a
clear specification for their interaction. In other words, the goals call for an access
control mechanism that combines the extension itself, the security constraints
for the extension as specified by the programmer, and a site's security policy
to produce a secure extension. At the same time, the mechanism is not limited
to changing only the extension as a result of this combination process, but can
impose security constraints on other parts of the extensible system as well. This
process of combining functionality and security to provide access control in an
extensible system is illustrated in Fig. 1.
Fig. 1. Overview of access control in an extensible system. The access control mecha-
nism combines the extension itself, the security constraints for the extension as specified
by the programmer, and a site's security policy to place a secure version of the extension
into the extensible system.
A design that addresses the four goals effectively defines the protocol by which
the security policy and the access control mechanism interact, and by which, if
necessary, extensions are notified of security-relevant events. As such, this proto-
col is an internal protocol. In other words, the abstractions used for expressing
protection domains and access control checks need not be, and probably should
not be, the same abstractions presented by the security policy. It is the respon-
sibility of a security policy manager to provide users and system administrators
with a high-level and user-friendly view of system security.
2.1 Examples
As long as extensions, such as Java applets in the sandbox model, use only a few,
selected core services, providing protection in an extensible system reduces to
isolating extensions from each other and performing access control checks in the
core services. However, for many real-world applications of extensibility, such
a protection scheme is clearly insufficient as extensions use some parts of the
system and, in turn, are used by other parts. For example, an extension may
3 Design
The design of our access control mechanism divides access control in an exten-
sible system into an enforcement manager and a security policy manager. The
enforcement manager is part of the core services of the extensible system. It
provides information on the types and operations of an extension, and redirects
procedure or method invocations to perform access control operations. The se-
curity policy manager is provided by a trusted extension, and determines the
actual security policy for the system. It decides which procedures require which
access control operations, and performs the actual mediation for access control.
This structure is illustrated in Fig. 2.
Fig. 2. Structure of the access control mechanism. The enforcement manager is part
of the core system services, provides information on the types and operations of an
extension (reflection), and redirects procedure or method invocations (interposition) to
ensure that a given security policy is actually enforced onto the system. The security
policy manager is a trusted extension, determines which abstractions require which
access control operations, and performs the actual mediation.
The protocol that determines the interaction between the enforcement and
the security policy manager relies on two abstractions, namely security identi-
fiers and sets of permissions, or access modes. Security identifiers are associated
with both subjects and objects, and represent privilege and access constraints.
Permissions are associated with operations, and represent the right to perform
an operation. The enforcement manager maintains the association of subjects
and objects with security identifiers, and performs access control checks based
on access modes. But, it does not interpret security identifiers and access modes,
as their meaning is determined by the security policy manager which performs
the actual mediation.
As extensible systems feature a considerably different structuring from tra-
ditional systems such as Unix, it is necessary to define the exact meaning of
subjects and objects. We treat threads in an extensible system as subjects, as
they are the only active entities, and all other entities, including extensions, as
objects. This is not to say that subjects only represent the principal that cre-
ated a thread. Rather, the rights of a subject depend on the current protection
domain, i.e. the extension whose code the thread is currently executing, and,
possibly, on previous protection domains, i.e. the history of extensions whose
code the thread executed before entering the current extension. Furthermore,
while we treat extensions as objects, they are subject to a somewhat different
form of access control than other objects in an extensible system.
The enforcement manager supports three types of access control operations. The
operations are (1) protection domain transfers to structure the system into pro-
tection domains, (2) access control checks to enforce these protection domains,
and (3) auditing to provide a trace of system operations. Protection domain
transfers change the protection domain associated with a thread, based on the
current protection domain of a thread and on the procedure that is about to
be invoked. Access checks determine whether the current subject is allowed to
execute a procedure at all, and control the passing of arguments and results.
For each argument that is passed into a procedure, and for each result that is
passed back from the procedure, access checks determine whether the subject
has sufficient rights for the corresponding object. Finally, auditing generates a
log-entry for each procedure invocation, and serves as an execution trace of the
system.
When instructing the enforcement manager to perform access control oper-
ations on a given procedure, the security policy manager specifies the types of
access control operations, i.e. any combination of protection domain transfer,
access checks, and auditing. For access checks, it also specifies the required ac-
cess modes, one for the procedure itself, one for each argument, and one for each
result.
The access control operations are ordered as follows. Before a given procedure
is executed, the enforcement manager first performs access checks, then a pro-
tection domain transfer, and, finally, auditing, which also records failed access
checks. On return from the procedure, the enforcement manager first performs
the reverse protection domain transfer, then access checks on the results, and,
finally, auditing, which, again, also records failed access checks.
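The ordering just described can be sketched as follows. The Python rendering is an illustration of ours, not the SPIN implementation: the names (`guarded_invoke`, `AUDIT_LOG`), the permission sets, and the thunk pair for the domain transfer are all assumptions, and access checks on results are elided for brevity.

```python
class AccessDenied(Exception):
    pass

AUDIT_LOG = []  # auditing: an execution trace of guarded invocations

def guarded_invoke(proc, args, allowed, required, transfer):
    # Before the call: access checks, then the protection domain transfer,
    # then auditing; the reverse order applies on return.
    try:
        if not required <= allowed:          # access check on the procedure
            raise AccessDenied(proc.__name__)
        transfer[0]()                        # enter the new protection domain
        AUDIT_LOG.append(("enter", proc.__name__))
        try:
            return_value = proc(*args)
        finally:
            transfer[1]()                    # reverse protection domain transfer
        AUDIT_LOG.append(("leave", proc.__name__))   # result checks elided
        return return_value
    except AccessDenied:
        AUDIT_LOG.append(("denied", proc.__name__))  # failed checks audited too
        raise
```

Note that a failed check is still audited, matching the requirement that auditing record failed access checks.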
To perform the access control operations, the enforcement manager requires
three mappings between security identifiers, types, and access modes. These
mappings are used to communicate a site's security policy between security
policy manager and enforcement manager. Using SID for security identifiers,
TYPE for types as defined by the extensible system, and ACCESSMODE for access
modes, the three mappings are:

    SID x SID  -> SID         (protection domain transfers)
    SID x SID  -> ACCESSMODE  (access checks)
    SID x TYPE -> SID         (object creation)
The first mapping is used for protection domain transfers. It maps the current
security identifier of a thread and the security identifier of the procedure that is
about to be called into the new security identifier of the thread. The enforcement
manager associates the thread with the new security identifier before control
passes into the actual procedure, and it restores the original security identifier
upon completion of the procedure.
The second mapping is used for access checks. It maps the security identifier
of a thread and the security identifier of an object into an access mode represent-
ing the maximum rights the subject has on the object. The enforcement manager
verifies that the maximal access mode contains all permissions of the required
access mode, as specified by the security policy manager when requesting the
access check.
The third mapping is used for the creation of objects. It maps the security
identifier of a thread that is about to create an object and the type of that
object into the security identifier for the newly created object. The enforcement
manager associates newly created objects with the resulting security identifier.
A simplification of this mapping may omit the object type from the mapping,
and simply map all objects created by a thread into the same security identifier.
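The three mappings can be summarized as a policy-manager interface. The class below is an invented sketch of the signatures only; its trivial rules stand in for an actual security policy and are not taken from the SPIN sources.

```python
class PolicyManagerSketch:
    """Illustrative rules only; a real policy manager interprets SIDs."""

    def transfer(self, thread_sid, proc_sid):
        # SID x SID -> SID: the new thread SID for a protection domain transfer
        return proc_sid

    def access_mode(self, thread_sid, object_sid):
        # SID x SID -> ACCESSMODE: maximal rights of the subject on the object
        return {"read", "write"} if thread_sid == object_sid else {"read"}

    def creation_sid(self, thread_sid, object_type):
        # SID x TYPE -> SID: here the simplified variant that ignores the type
        # and maps all objects created by a thread to the same SID
        return thread_sid
```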
New subjects, that is freshly spawned threads, are associated with the same
security identifier as the spawning thread so that they possess the same privi-
leges. An exception to this rule occurs for threads that are created to link and
initialize extensions (as discussed above), and for threads that are created when
a user logs into the system. In the latter case, an appropriate form of authenti-
cation (such as a password) establishes the identity of the user to the security
policy manager, and the enforcement manager associates the thread with the
corresponding security identifier.
4 Implementation
We have implemented our access control mechanism in the SPIN extensible
operating system [6]. Our access control mechanism does not depend on features
that are unique to SPIN, and could be implemented in other systems. It requires
support for dynamically loading and linking extensions, for multiple concurrent
threads of execution, for determining an extension's types and operations, and
for redirecting procedure or method invocations (for example, by patching object
jump tables either statically or dynamically). Consequently, our access control
mechanism can be implemented in other extensible systems that provide these
features, such as Java.
Our implementation is guided by three constraints. First, it has to correctly
enforce a given security policy as defined by the security policy manager. Second,
it has to be simple and well-structured to allow for validation¹ and for easy
transfer to other systems. Third, the implementation should be fast to impose
as little performance overhead as possible.
In SPIN, a statically linked core provides most basic services, including hard-
ware support, the Modula-3 runtime [42,21], the linker/loader [41], threads, and
the event dispatcher [35]. All other services, including networking and file system
support, are provided by dynamically linked extensions. We have implemented
the basic abstractions of our access control mechanism, such as security identi-
fiers and access modes, as well as the enforcement manager as part of this static
core.
Services in the static core are trusted in that, if they misbehave, the security
of the system can be undermined, and the system may even crash. At the same
time, the static core must be protected against dynamically linked extensions
which usually are not trusted. Consequently, the enforcement manager imposes
access control on the core services, including the linker/loader as described in
Sect. 3.1, to protect itself and other core services, and to ensure that only a
trusted extension can define the security policy. User-space applications in SPIN
need not be written in Modula-3, are not guaranteed to be type-safe, and thus
are generally untrusted. They can not access any kernel-level objects directly,
but only through a narrowly defined system call interface, which automatically
subjects them to our access control mechanism.
The implementation of our access control mechanism consists of 1000 lines
of well-documented Modula-3 interfaces and 2400 lines of Modula-3 code, with
an additional 50 lines of changes to other parts of the static SPIN core. The im-
plementation uses the Modula-3 runtime to determine the types and operations
of an extension, and the event dispatcher [35] to inject access control operations
into the system. It defines the abstractions for security identifiers and access
modes. Security identifiers are simply integers. Access modes are immutable ob-
jects, and are represented by a set of simple, pre-defined permissions in addition
to a list of permission objects. The simple permissions provide 64 permissions at
a low overhead. The list of permission objects lets the security policy manager
define additional permissions (where each permission object can represent
several permissions) by subtyping from an abstract base class, at some
performance cost.

¹ We have not validated the implementation. However, a critical characteristic
for any security mechanism is that it be small and well-structured [38].
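The access-mode representation just described can be sketched as follows, assuming the 64 simple permissions form a bitmask; the class names and the containment test are our illustration, not the Modula-3 implementation.

```python
class Permission:
    """Abstract base class; a security policy manager subtypes it to define
    additional permissions (each instance may stand for several)."""

class AccessMode:
    def __init__(self, simple=0, objects=()):
        self.simple = simple                # bitmask over 64 simple permissions
        self.objects = frozenset(objects)   # extensible permission objects

    def covers(self, required):
        # An access check passes when the maximal mode contains every
        # permission of the required mode.
        return (required.simple & ~self.simple) == 0 and \
               required.objects <= self.objects
```

The bitmask test is the cheap common case; the set comparison over permission objects models the slower, extensible path.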
5 Discussion
By using our access control mechanism, fine-grained security constraints can be
imposed onto an extensible system. However, the expressiveness of our mecha-
nism is limited in that it can not supplant prudent interface design. In particular,
three issues arise, namely the use of abstract data types, the granularity of in-
terfaces, and the effect of calling conventions.
Our access control mechanism provides protection on objects in that it pro-
vides control over which operations a subject can legally execute on an object,
including control over which objects can be passed to and returned from an oper-
ation. To do so, it relies on abstract data types to hide the implementation of an
object. In other words, if the type of an object does not hide its implementation,
it is possible to directly access and modify an object without explicitly invoking
any of the corresponding operations and thus without incurring access control.
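A minimal illustration of this point, with invented names (Python only approximates opaque types, which makes the hazard easy to show):

```python
AUDIT = []

class ExposedCounter:
    """A type that fails to hide its implementation."""
    def __init__(self):
        self.value = 0          # state directly visible to every client

    def increment(self):        # the only operation access control would mediate
        AUDIT.append("increment")
        self.value += 1

c = ExposedCounter()
c.value = 999                   # modifies the object without invoking any
                                # operation, so no access check or audit runs
```

Had the type hidden `value` behind its operations, every modification would have passed through `increment` and hence through the interposed checks.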
The structure of an interface also influences the degree of control attainable
over the operations on an object. In particular, the granularity of an interface,
i.e. how an interface decomposes into individual operations on a type, deter-
mines the granularity of access control. So, an interface with only one operation,
which, like ioctl in Unix, might use an integer argument to name the actual
operation, allows for much less fine-grained control than an interface with several
independent operations.
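The contrast can be made concrete with an invented device interface: with a single ioctl-style entry point, one permission necessarily covers every sub-operation, whereas separate operations can each require their own permission.

```python
def device_control(opcode, arg=None):
    # Coarse-grained: one checkable operation for all requests; access
    # control cannot distinguish a speed change from a reset.
    if opcode == 1:
        return ("set_speed", arg)
    if opcode == 2:
        return ("reset", None)
    raise ValueError("unknown opcode %r" % opcode)

# Fine-grained: each operation is separately visible to access control,
# so each can carry its own required access mode.
def set_speed(arg):
    return ("set_speed", arg)

def reset():
    return ("reset", None)
```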
The calling convention used for passing arguments to a procedure or object
method affects whether argument passing can be fully controlled. Notably, call-
by-reference grants both caller and callee access to the same variable. As caller
and callee may be in different protection domains, call-by-reference effectively
creates (type-safe) shared memory. In a multi-threaded system, information can
be passed through shared memory at any time, not just on procedure invocation
and return. Consequently, caller and callee need to trust each other on the use
of this shared memory, and access checks on call-by-reference arguments are
not very meaningful. In SPIN, call-by-reference is almost always used to return
additional results from a procedure, as Modula-3 only supports one result value.
This unnecessary use of shared memory could clearly be avoided by supporting
multiple results or thread-safe calling conventions such as call-by-value/result at
the programming language level.
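The effect can be demonstrated by emulating a VAR parameter with a mutable cell; the example is ours, with the shared case standing in for Modula-3's call-by-reference and the copy case for call-by-value/result.

```python
def callee_by_reference(cell):
    # Caller and callee now share the variable: the callee can write it at
    # any time, not just at invocation and return.
    cell[0] = "written by callee"

def callee_value_result(value):
    # Copy-in/copy-out: the only data flow is at call and return, which is
    # exactly where access checks can mediate it.
    return "written by callee"

shared = ["written by caller"]
callee_by_reference(shared)            # shared[0] changes under the caller

copy = callee_value_result("written by caller")   # explicit copy-out
```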
The three issues just discussed are directly related to our access control mech-
anism relying on an extension's interface, that is on the externally visible types
and operations of an extension, to impose access constraints. A more powerful
model could be used to express finer-grained security constraints. And, more
aggressive techniques, such as binary rewriting [45,43,19,37], could be used to
enforce these constraints in an extensible system. But such a system would also
require a considerably more complex design and implementation. At the same
time, an extension's interface is a "natural" basis for access control, as it provides
a concise and well-understood specification of what an extension exports to other
extensions and how it interacts with them. Consequently, we believe that our
access control mechanism strikes a reasonable balance between expressiveness
and complexity.
6 Performance Evaluation
To determine the performance overhead of our implementation, we evaluate a set
of micro-benchmarks that measure the performance of access control operations.
We also present end-to-end performance results for a web server benchmark. We
collected our measurements on a DEC Alpha AXP 133 MHz 3000/400 worksta-
tion, which is rated at 74 SPECint 92. The machine has 64 MByte of memory,
a 512 KByte unified external cache, and an HP C2247-300 1 GByte disk-drive.
In summary, the micro-benchmarks show that access control operations incur
some latency on trivial operations, while the end-to-end experiment shows that
the overall overhead of access control is in the noise.
6.1 Micro-Benchmarks
Table 1. Performance numbers for access control operations. All numbers are the mean
of 1000 trials in microseconds. Hot represents hot microprocessor cache performance
and Cold cold microprocessor cache performance.

                               Hot   Cold
Null procedure call            0.1    0.5
Protection domain transfer     4.4    7.8
Access check on procedure      2.8    6.4
Access check on 1 argument     4.0    9.7
Access check on 2 arguments    6.7   12.0
Access check on 4 arguments   12.1   17.7
Access check on 8 arguments   24.0   29.5
at the beginning and at the end of the loop. To determine cold microprocessor
cache performance, we measure the time before and after each trial separately,
and flush both the instruction and data cache on each iteration.
Table 2 shows the instruction breakdown of the common path for protec-
tion domain transfers, excluding the overhead for the event dispatcher (which
amounts to 31 or 48 instructions, depending on the optimizations used within
the event dispatcher [35]). On a protection domain transfer, the enforcement
manager establishes the new protection domain before control passes into the
actual procedure, and restores the original protection domain upon completion
of the procedure. Before entering the procedure, the enforcement manager first
determines the security identifiers of the thread and of the procedure. Then,
based on these security identifiers, it looks up the security identifiers for the
thread and new objects created by the thread in the mediation cache, which re-
quires obtaining a lock for the cache. Next, it sets up a new exception frame, so
that the original protection domain can be restored on an exceptional procedure
exit. Finally, it pushes a new record containing the security identifiers for the
thread and new objects onto the thread's security identifier stack. After leaving
the procedure, the enforcement manager pops the top from the thread's security
identifier stack and removes the exception frame.
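The common path just described can be sketched as follows; the mediation cache, the SID stack, and the try/finally block (standing in for the exception frame) are an illustrative rendering of ours, not the SPIN code.

```python
class Thread:
    def __init__(self, sid):
        self.sid_stack = [sid]            # security identifiers, current on top

def domain_transfer(policy, cache, thread, proc_sid, proc):
    key = (thread.sid_stack[-1], proc_sid)
    if key not in cache:                  # consult the mediation cache first
        cache[key] = policy(*key)         # miss: ask the security policy manager
    thread.sid_stack.append(cache[key])   # establish the new protection domain
    try:
        return proc()
    finally:
        thread.sid_stack.pop()            # restore the original domain on both
                                          # normal and exceptional exit
```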
Additional experiments show that performing a protection domain transfer
in addition to access checks adds 3.9 microseconds to hot cache performance
and 5.6 microseconds to cold cache performance for those of the above bench-
marks that perform access checks. Furthermore, using permission objects instead
of simple permissions for access checks, where the required permission object
matches the tenth object in the list of legal permission objects (which represents
a pessimistic scenario as each permission object can stand for dozens of indi-
vidual permissions), adds 6.8 microseconds for hot cache performance and 7.0
microseconds for cold cache performance per argument.
The performance results show that access control operations have noticeable
overhead. They thus back our basic premise that access control for extensible
systems should only impose as much structure as strictly necessary. Furthermore,
Table 2. Instruction breakdown of the common path for protection domain transfers
excluding the cost for the event dispatcher. "Overhead" is the overhead of performing
both protection domain changes within their own procedure. The other operations are
explained in the text.
Operation # Instr.
they underline the need for a design that enables dynamic optimizations which
avoid access control operations whenever possible.
control checks on both the NFS client and the local cache, and since the indi-
vidual threads (spawned to serve requests) can only communicate through NFS
and the local cache, the policy ensures that only authorized files are accessible
through the web server. Furthermore, it makes it possible to securely change
privileges on a per-request basis, either based on a remote login, or based on the
machine from which the request originated.
Our performance benchmark sends http requests from one machine that is
running the benchmark script to another that is running the web server. It reads
the entire SPIN web tree, to a total of 79 files or 5035 KByte of data. We run the
benchmark without access control, as a baseline, as well as with access control,
to measure the end-to-end overhead of our access control mechanism. For each
measurement, we first perform 15 runs of the benchmark to pre-warm the local
cache, and then measure the latency for 20 runs. The average latency for one run
of the benchmark both without and with access control is 16.9 seconds (including
5.4 seconds idle time on the machine running the web server), and the difference
between the two is in the noise. Trials with access control incur a total of 1573
access checks, on average 20 for each file.
The end-to-end performance experiments show that the overhead of access
control operations is negligible for a web server workload. We extrapolate from
this result that other real-world applications will also see only a small over-
head. To better quantify this overhead, we plan to conduct further
experiments in the future that use more complex security policies and require
finer-grained access control operations.
7 Related Work
8 Future Work
The access control mechanism described in this paper provides us with an ideal
test-bed for future research on the security of extensible systems. Specifically,
the policy-neutral and transparent enforcement manager, with its ability to ar-
bitrarily inject protection domains and access checks into an extensible system,
offers us considerable power and flexibility. We are particularly interested in
three areas for future work: First, programmers and security administrators need
to be able to specify security constraints for the code they write and use. We
thus plan to investigate appropriate specification languages that are both user-
friendly (i.e., present a high level of abstraction) and sufficiently powerful to
conveniently express detailed security policies (i.e., provide enough flexibility).
Second, as extensions often execute in networked environments, a protocol for
the secure expression and transfer of credentials is required. We thus intend to
examine distributed authentication and authorization protocols, such as those
described in [23,4,14], in the context of extensible systems. Finally, as illus-
trated by the micro-benchmarks in Sect. 6, the access control operations show a
relatively high overhead when compared to a simple procedure invocation. We
thus plan to explore aggressive optimizations that avoid dynamic access control
operations whenever possible.
9 Conclusions
The access control mechanism for extensible systems described in this paper
breaks up access control into a policy-neutral enforcement manager and a secu-
rity policy manager, and is transparent to extensions in the absence of security
violations. It structures the system into protection domains through protec-
tion domain transfers, enforces these protection domains through access control
checks, and provides a trace of system operations through auditing. It works by
inspecting extensions for their types and operations to determine what abstrac-
tions require protection, and by redirecting procedure or method invocations to
inject access control operations into the system. The access control mechanism
is based on a simple, yet powerful protocol by which the security policy and the
enforcement manager interact, and by which, if necessary, extensions are notified
of security-relevant events.
The implementation of our access control mechanism within the SPIN ex-
tensible operating system is simple, and, even though the latency of individual
access control operations can be noticeable, shows good end-to-end performance.
Based on our results, we predict that most systems will see a very small overhead
for access control, and thus consider our access control mechanism an effective
solution for access control in extensible systems.
Acknowledgments
The research presented in this paper was sponsored by the Defense Advanced
Research Projects Agency, the National Science Foundation and by an equipment
References
1. L. Badger, K. A. Oostendorp, W. G. Morrison, K. M. Walker, C. D. Vance, D. L.
Sherman, and D. F. Sterne. DTE Firewalls—Initial Measurement and Evaluation
Report. Technical Report 0632R, Trusted Information Systems, March 1997.
2. L. Badger, D. F. Sterne, D. L. Sherman, K. M. Walker, and S. A. Haghighat.
A Domain and Type Enforcement UNIX Prototype. In Proceedings of the Fifth
USENIX UNIX Security Symposium, pages 127-140, Salt Lake City, Utah, June
1995.
3. L. Badger, D. F. Sterne, D. L. Sherman, K. M. Walker, and S. A. Haghighat.
Practical Domain and Type Enforcement for UNIX. In Proceedings of the 1995
IEEE Symposium on Security and Privacy, pages 66-77, Oakland, California, May
1995.
4. E. Belani, A. Vahdat, T. Anderson, and M. Dahlin. The CRISIS Wide Area
Security Architecture. In Proceedings of the 7th USENIX Security Symposium,
San Antonio, Texas, January 1998.
5. D. E. Bell and L. J. La Padula. Secure Computer System: Unified Exposition
and Multics Interpretation. Technical Report MTR-2997 Rev. 1, The MITRE
Corporation, Bedford, Massachusetts, March 1976. Also ADA023588, National
Technical Information Service.
6. B. N. Bershad, S. Savage, P. Pardyak, E. G. Sirer, M. Fiuczynski, D. Becker,
S. Eggers, and C. Chambers. Extensibility, Safety and Performance in the SPIN
Operating System. In Proceedings of the 15th Symposium on Operating Systems
Principles, pages 267-284, Copper Mountain, Colorado, December 1995.
7. K. J. Biba. Integrity Considerations for Secure Computer Systems. Technical Re-
port MTR-3153 Rev. 1, The MITRE Corporation, Bedford, Massachusetts, April
1977. Also ADA039324, National Technical Information Service.
8. W. E. Boebert and R. Y. Kain. A Practical Alternative to Hierarchical Integrity
Policies. In Proceedings of the 17th National Computer Security Conference, pages
18-27, Gaithersburg, Maryland, 1985.
9. D. F. C. Brewer and M. J. Nash. The Chinese Wall Security Policy. In Proceedings
of the 1989 IEEE Symposium on Security and Privacy, pages 206-214, Oakland,
California, May 1989.
10. D. D. Clark and D. R. Wilson. A Comparison of Commercial and Military Com-
puter Security Policies. In Proceedings of the 1987 IEEE Symposium on Security
and Privacy, pages 184-194, Oakland, California, April 1987.
28. G. McGraw and E. W. Felten. Java Security: Hostile Applets, Holes and Antidotes.
Wiley Computer Publishing, John Wiley & Sons, Inc., New York, New York, 1997.
29. M. K. McKusick, K. Bostic, M. J. Karels, and J. S. Quarterman. The Design
and Implementation of the 4.4BSD Operating System. Addison-Wesley Publishing
Company, Reading, Massachusetts, 1996.
30. S. E. Minear. Providing Policy Control Over Object Operations in a Mach Based
System. In Proceedings of the Fifth USENIX UNIX Security Symposium, pages
141-156, Salt Lake City, Utah, June 1995.
31. G. Morrisett, D. Walker, K. Crary, and N. Glew. From System F to Typed Assem-
bly Language. In Proceedings of the 25th Symposium on Principles of Programming
Languages, San Diego, California, January 1998.
32. A. C. Myers and B. Liskov. A Decentralized Model for Information Flow Control.
In Proceedings of the 16th Symposium on Operating Systems Principles, pages 129-
142, Saint-Malo, France, October 1997.
33. G. C. Necula and P. Lee. Safe Kernel Extensions Without Run-Time Checking.
In Proceedings of the Second Symposium on Operating Systems Design and Imple-
mentation, pages 229-243, Seattle, Washington, October 1996.
34. D. Olawsky, T. Fine, E. Schneider, and R. Spencer. Developing and Using a "Policy
Neutral" Access Control Policy. In Proceedings of the New Security Paradigms
Workshop, September 1996.
35. P. Pardyak and B. N. Bershad. Dynamic Binding for an Extensible System. In
Proceedings of the Second Symposium on Operating Systems Design and Imple-
mentation, pages 201-212, Seattle, Washington, October 1996.
36. J. Richardson, P. Schwarz, and L.-F. Cabrera. CACL: Efficient Fine-Grained Pro-
tection for Objects. In Proceedings of the Conference on Object-Oriented Pro-
gramming Systems, Languages, and Applications '92, pages 263-275, Vancouver,
Canada, October 1992.
37. T. Romer, G. Voelker, D. Lee, A. Wolman, W. Wong, H. Levy, B. N. Bershad, and
B. Chen. Instrumentation and Optimization of Win32/Intel Executables Using
Etch. In Proceedings of the USENIX Windows NT Workshop, pages 1-8, Seattle,
Washington, August 1997.
38. J. H. Saltzer and M. D. Schroeder. The Protection of Information in Computer
Systems. Proceedings of the IEEE, 63(9):1278-1308, September 1975.
39. Secure Computing Corporation. DTOS General System Security and Assurability
Assessment Report. Technical Report DTOS CDRL A011, Secure Computing
Corporation, 2675 Long Lake Road, Roseville, Minnesota 55113-2536, June 1997.
40. Secure Computing Corporation. DTOS Lessons Learned Report. Technical Report
DTOS CDRL A008, Secure Computing Corporation, 2675 Long Lake Road,
Roseville, Minnesota 55113-2536, June 1997.
41. E. G. Sirer, M. Fiuczynski, P. Pardyak, and B. N. Bershad. Safe Dynamic Linking
in an Extensible Operating System. In Proceedings of the Workshop on Compiler
Support for System Software, pages 134-140, Tucson, Arizona, February 1996.
42. E. G. Sirer, S. Savage, P. Pardyak, G. P. DeFouw, M. A. Alapat, and B. N. Bershad.
Writing an Operating System with Modula-3. In Proceedings of the Workshop on
Compiler Support for System Software, pages 141-148, Tucson, Arizona, February
1996.
43. A. Srivastava and A. Eustace. ATOM: A System for Building Customized Program
Analysis Tools. In Proceedings of the ACM SIGPLAN '94 Conference on Program-
ming Language Design and Implementation, pages 196-205, Orlando, Florida, June
1994.
Michael B. Jones
1 Introduction
1.1 Terminology
This paper was originally printed in Proceedings of the 14th ACM Symposium on
Operating Systems Principles, pages 80-93, Asheville, NC, December 1993. ©1993
ACM
1.2 Overview
This paper presents a toolkit that substantially increases the ease of interposing
user code between clients and instances of the system interface by allowing such
code to be written in terms of the high-level objects provided by this interface,
rather than in terms of the intercepted system calls themselves. Providing an
object-oriented toolkit exposing the multiple layers of abstraction present in
the system interface provides a useful set of tools and interfaces at each level.
Different agents can thus exploit the toolkit objects best suited to their individual
needs. Consequently, substantial amounts of toolkit code can be reused
when constructing different agents. Furthermore, having such a toolkit enables
new system interface implementations to be written, many of which would not
otherwise have been attempted.
Just as interposition is successfully used today to extend operating system
interfaces based on such communication-based facilities as pipes, sockets, and
inter-process communication channels, interposition can also be successfully used
to extend the system interface. In this way, the known benefits of interposition
can also be extended to the domain of the system interface.
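As a minimal illustration of such an agent, the following sketch (class name and backend invented) forwards every call on a "system interface" object to a backend while recording a trace, the way a monitoring agent would.

```python
class TracingAgent:
    """Interposes on a system-interface object: it uses the backend's
    interface and implements the same interface toward its clients."""

    def __init__(self, backend):
        self._backend = backend
        self.trace = []

    def __getattr__(self, name):
        target = getattr(self._backend, name)
        def forward(*args):
            self.trace.append((name, args))   # monitor the call...
            return target(*args)              # ...then pass it through
        return forward

class FakeKernel:
    def open(self, path):
        return "fd:" + path

agent = TracingAgent(FakeKernel())   # clients use `agent` exactly as the kernel
fd = agent.open("/etc/motd")
```

Because the agent exports the same interface it consumes, agents can be stacked, which is the layering property the toolkit exploits.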
1.3 Examples
The following figures should help clarify both the system interface and interpo-
sition.
Figure 2 depicts the ability to transparently interpose user code that both
uses and implements the operating system interface between an unmodified ap-
plication program and the operating system kernel.
Figure 3 depicts uses of the system interface with interposition. Here, both
the kernel and interposition agents provide instances of the operating system
interface.
Fig. 4. Agents can share state and provide multiple instances of system interface
Figure 4 depicts more uses of the system interface with interposition. In this
view agents, like the kernel, can share state and provide multiple instances of
the operating system interface.
1.4 Motivation
Today, agents are regularly written to be interposed on simple communication-
based interfaces such as pipes and sockets. Similarly, the toolkit makes it possible
to easily write agents to be interposed on the system interface.
Interposition can be used to provide programming facilities that would oth-
erwise not be available. In particular, it can allow for a multiplicity of simul-
taneously coexisting implementations of the system call services, which in turn
may utilize one another without requiring changes to existing client binaries and
without modifying the underlying kernel to support each implementation.
Alternate system call implementations can be used to provide a number of
services not typically available on system call-based operating systems. Some
examples include:
— System Call Tracing and Monitoring Facilities: Debuggers and pro-
gram trace facilities can be constructed that allow monitoring of a program's
use of system services in an easily customizable manner.
— Emulation of Other Operating Systems: Alternate system call imple-
mentations can be used to concurrently run binaries from variant operating
systems on the same platform. For instance, it could be used to run UL-
TRIX [13], HP-UX [10], or UNIX System V [3] binaries in a Mach/BSD
environment.
— Protected Environments for Running Untrusted Binaries: A wrap-
per environment can be constructed that allows untrusted, possibly mali-
cious, binaries to be run within a restricted environment that monitors and
emulates the actions they take, possibly without actually performing them,
and limits the resources they can use in such a way that the untrusted bina-
ries are unaware of the restrictions. A wide variety of monitoring and emu-
lating schemes are possible from simple automatic resource restriction envi-
ronments to heuristic evaluations of the target program's behavior, possibly
including interactive decisions made by human beings during the protected
execution. This is particularly timely in today's environments of increased
software sharing with the potential for viruses and Trojan horses.
— Transactional Software Environments: Applications can be constructed
that provide an environment in which changes to persistent state made by
unmodified programs can be emulated and performed transactionally. For
instance, a simple "run transaction" command could be constructed that
runs arbitrary unmodified programs (e.g., /bin/csh) such that all persistent
execution side effects (e.g., filesystem writes) are remembered and appear
within the transactional environment to have been performed normally, but
where in actuality the user is presented with a "commit" or "abort" choice at
the end of such a session. Indeed, one such transactional program invocation
could occur within another, transparently providing nested transactions.
— Alternate or Enhanced Semantics: Environments can be constructed
that provide alternate or enhanced semantics for unmodified binaries. One
such enhancement in which people have expressed interest is the ability to
"mount" a search list of directories in the filesystem name space such that
the union of their contents appears to reside in a single directory. This could
be used in a software development environment to allow distinct source and
object directories to appear as a single directory when running make.
The key insight that enabled me to gain leverage on the problem of writing sys-
tem interface interposition agents for the 4.3BSD [25] interface is as follows: while
the 4.3BSD system interface contains a large number of different system calls,
it contains a relatively small number of abstractions (whose behavior is largely
independent). (In 4.3BSD, the primary system interface abstractions are path-
names, descriptors, processes, process groups, files, directories, symbolic links,
pipes, sockets, signals, devices, users, groups, permissions, and time.) Further-
more, most calls manipulate only a few of these abstractions.
Thus, it should be possible to construct a toolkit that presents these ab-
stractions as objects in an object-oriented programming language. Such a toolkit
would then be able to support the substantial commonalities present in different
agents through code reuse, while also supporting the diversity of different kinds
of agents through inheritance.
2 Research Overview
2.1 Design Goals
Figure 5 presents a diagram of the primary classes currently provided with the
interposition toolkit. Indented classes are subclasses of the classes above. Arrows
indicate the use of one class by another. Many of these classes are explained in
more detail in this section.
numeric_syscall
    bsd_numeric_syscall
symbolic_syscall
    desc_symbolic_syscall
    path_symbolic_syscall
descriptor_set          pathname_set
    open_descriptor_set
descriptor              pathname
    open_descriptor
open_object
    directory
The lowest layers of the toolkit perform such functions as agent invocation,
system call interception, incoming signal handling, performing system calls on
behalf of the agent, and delivering signals to applications running under agent
code. Unlike the higher levels of the toolkit, these layers are sometimes highly
operating system specific and also contain machine specific code.
These layers hide the mechanisms used to intercept system calls and signals,
those that are used to call down from an agent to the next level system interface,
and those that are used to send a signal from an agent up to the application
program. These layers also hide such details as whether the agent resides in the
same address space as the application program or whether it resides in a separate
address space. These layers are referred to as the boilerplate layers. These layers
are not normally used directly by interposition agents.
The lowest (or zeroth) layer of the toolkit which is directly used by any in-
terposition agents presents the system interface as a single entry point accepting
vectors of untyped numeric arguments. It provides the ability to register for spe-
cific numeric system calls to be intercepted and for incoming signal handlers to
be registered. This layer is referred to as the numeric system call layer.
Example interfaces provided by the numeric system call layer are as follows:
class numeric_syscall {
public:
virtual int syscall(int number, int args[],
int rv[2], void *regs);
virtual void init(char *agentargv[]);
virtual void signal_handler(int sig, int code,
struct sigcontext *context);
void register_interest(int number);
void register_interest_range(int low, int high);
};
For instance, using just the numeric system call layer, by using a derived
version of the numeric_syscall class with an agent-specific syscall() method,
an agent writer could trivially write an agent that printed the arguments of a
chosen set of system calls as uninterpreted numeric values. As another example,
one range of system call numbers could be remapped to calls on a different range
at this level.
The first layer of the toolkit intended for direct use by most interposition
agents presents the system interface as a set of system call methods on a system
interface object. When this layer is used by an agent, application system calls are
mapped into invocations on the system call methods of this object. (This map-
ping is itself done by a toolkit-supplied derived version of the numeric_syscall
object.) This layer is referred to as the symbolic system call layer.
Example interfaces provided by the symbolic system call layer are as follows:
class symbolic_syscall {
public:
    virtual void init(char *agentargv[]);
    virtual void init_child();
    virtual int unknown_syscall(int number, int *args, int rv[2],
                                struct emul_regs *regs);
    virtual void signal_handler(int sig, int code,
                                struct sigcontext *context);
};
For example, the trace agent, described in Section 3.3, prints the arguments to each
executed system call in human-readable form from individual system call methods
in a derived symbolic_syscall object.
The second layer of the toolkit is structured around the primary abstrac-
tions provided by the system call interface. In 4.3BSD, these include path-
names, file descriptors, processes, and process groups. This layer presents the
system interface as sets of methods on objects representing these abstractions.
Toolkit objects currently provided at this level are the filesystem name space
(pathname_set), resolved pathnames (pathname), the file descriptor name space
(descriptor_set), active file descriptors (descriptor), and reference counted
open objects (open_object). Such operations as filesystem name space trans-
formations and filesystem usage monitoring are done at this level.
For example, agents can interpose on pathname operations by using derived
versions of two classes: pathname_set and pathname. The pathname_set class
provides operations that affect the set of pathnames, i.e., those that create or
remove pathnames. The pathname class provides operations on the objects ref-
erenced by the pathnames.
Example interfaces provided by the pathname_set class are as follows:
class pathname_set : public descriptor_set {
protected:
    virtual int getpn(char *path, int flags, pathname **pn);
public:
    virtual void init(char *agentargv[],
                      class PATH_SYMBOLIC_BASE *path_sym);
    // ...
};
Each pathname-based system call method simply resolves its pathname string to a
pathname object using getpn() and then invokes the corresponding pathname
method on the resulting object. The
pathname method is responsible for actually performing the requested operation
on the object referenced by the pathname.
With the getpn() operation to encapsulate pathname lookup, it is possible
for agents to supply derived versions of the pathname_set object with a new
getpn() implementation that modifies the treatment of all pathnames. For in-
stance, this can be used to logically rearrange the pathname space, as was done
by the union agent (described in Section 3.3). Likewise, it provides a central
point for name reference data collection, as was done by the dfs_trace agent
(described in Section 3.5).
A third set of toolkit layers focuses on secondary objects provided by the
system call interface, which are normally accessed via primary objects. Such
objects include files, directories, symbolic links, devices, pipes, and sockets. These
layers present the system interface as sets of methods on objects, with specialized
operations for particular classes of objects. The only toolkit object currently
provided at this level is the open directory (directory) object. Operations that
are specific to these secondary objects such as directory content transformations
are done at this level.
For example, agents can interpose on directory operations by using derived
versions of the directory class. The directory class is itself a derived version of
the open_object class (one of the second layer classes for file descriptor oper-
ations), since directory operations are a special case of operations that can be
performed on file descriptors.
Example interfaces provided by the directory class are as follows:
class directory : public OPEN_OBJECT_CLASS {
public:
    virtual int next_direntry();
    struct direct *direntry;  // Set by next_direntry()
public:
    virtual int read(void *buf, int cnt, int rv[2]);
    virtual int lseek(off_t offset, int whence, int rv[2]);
    virtual int getdirentries(void *buf, int cnt, long *basep,
                              int rv[2]);
};
3 Results
Unmodified Applications
Agents constructed using the system interface interposition toolkit can load
and run unmodified 4.3BSD binaries. No recompilation or relinking is neces-
sary. Thus, agents can be used for all program binaries — not just those for
which sources or object files are available.
Applications do not have to be adapted to or modified for particular agents.
Indeed, the presence of agents should be transparent to applications.^
Unmodified Kernel
Agents constructed using the system interface interposition toolkit do not require
any agent-specific kernel modifications. Instead, they use general system call
handling facilities that are provided by the kernel in order to implement all
agent-specific system call behavior. Also, a general agent loader program is used
to invoke arbitrary agents, which are compiled separately from the agent loader.
The Mach 2.5 kernel used for this work contains a primitive,
task_set_emulation(), that allows 4.3BSD system calls to be redirected for
execution in user space. Another primitive, htg_unix_syscall(), permits calls
to be made on the underlying 4.3BSD system call implementation even though
those calls are being redirected.
Sizes of Agents
Agent Name    Toolkit Statements    Agent Statements    Total Statements
timex               2467                  35                 2502
trace               2467                1348                 3815
union               3977                 166                 4143
The toolkit code used by the timex agent contains 2467 statements. The code specific to this agent consists of
only two routines: a new derived implementation of the gettimeofday() system
call and an initialization routine to accept the desired effective time of day from
the command line. This code contains only 35 statements.
The core of the timex agent is as follows:
class timex_symbolic_syscall : public symbolic_syscall {
public:
    virtual void init(char *agentargv[]);
    virtual int sys_gettimeofday(struct timeval *tp,
                                 struct timezone *tzp, int rv[2]);
private:
    int offset;  // Difference between real and funky time
};
int timex_symbolic_syscall::sys_gettimeofday(
struct timeval *tp, struct timezone *tzp, int rv[2])
{
int ret;
ret = symbolic_syscall::sys_gettimeofday(tp, tzp, rv);
if (ret >= 0 && tp) {
tp->tv_sec += offset;
}
return ret;
}
The new code necessary to construct the timex agent using the toolkit con-
sists only of the implementation of the new functionality. Inheritance from toolkit
objects is used to obtain implementations of all system interface behaviors that
remain unchanged.
Size Results
The above examples demonstrate several results pertaining to the code size of
agents written using the interposition toolkit. One result is that the size of
the toolkit code dominates the size of agent code for simple agents. Using the
toolkit, the amount of new code to perform useful modifications of the system
interface semantics can be small.
^ The dfs_trace agent implements file reference tracing tools that are compatible
with existing tools [30] originally implemented for use by the Coda [38, 23] filesystem
project. This agent is discussed further in Section 3.5.
Format my dissertation
Agent Name    Seconds    Slowdown
None           131.5        —
timex          132.0       0.5%
trace          135.0       2.5%
union          136.5       3.5%
When the union agent is considered, there is only an additional 5.0 seconds, giving an effective agent cost
of 3.5% of the base run time.
It comes as no surprise that trace, while conceptually simple, incurs percep-
tible overheads. Each system call made by the application to the trace agent
results in at least an additional two write() system calls in order to write the
trace output.^
Make 8 programs
Agent Name    Seconds    Slowdown
None            16.0        —
timex           19.0       19%
trace           33.0      107%
union           29.0       82%
When run under the simplest agent, timex, an additional three seconds of
overhead are added, giving an effective additional cost of 19% of the base run-
time. When run under union, which interposes on most of the system calls and
which uses several additional layers of toolkit abstractions, the additional over-
head beyond the no agent case is 13.0 seconds, giving an effective additional cost
^ Trace output is not buffered across system calls so it will not be lost if the process
is killed.
of 82% of the base runtime. When run under trace, an additional 17.0 seconds
of run time are incurred, yielding a slowdown of 107%.
Again, it comes as no surprise that union introduces more overhead than
timex. It interposes on the vast majority of the system calls, unlike timex,
which interposes on only the bare minimum plus gettimeofday(). Also, union
uses several additional layers of implementation abstractions not used by timex.
As with the previous application, the larger slowdown for trace is unsur-
prising. Given the large number of system calls made by this application and
the additional two write() operations performed per application system call for
writing the trace log, the log output time constitutes a significant portion of the
slowdown.
An analysis of low-level performance characteristics is presented in Sections
3.5 and 3.5.
This section examines the performance of the low-level mechanisms used to
implement interposition and of several commonly used system calls, both without
and with interposition.
Table 4 presents the performance of several low-level operations used to im-
plement interposition. All measurements were taken on a 25MHz Intel 486 run-
ning Mach 2.5 version X144. The code measured was compiled with gcc or g++
version 1.37 with debugging (-g) symbols present.
First, there is a fixed baseline cost to intercept a system call, save the register
state, call a system call dispatching routine, return from the dispatching routine,
load a new register state, and return from the intercepted system call. This
provides a lower bound on the total cost of any system call implemented by an
interposition agent.
Second, using htg_unix_syscall() to make a system call adds 37 µsec of
overhead beyond the normal cost of the system call. This provides a lower bound
on the additional cost for an agent to make a system call that otherwise would
be intercepted by the agent.
Thus, any system call intercepted by an agent that then makes the same
system call as part of the intercepted system call's implementation will take at
least 67 µsec longer than the same system call would have if made with no agent
present. Comparing the 67 µsec overhead to the normal costs of some commonly
used system calls (found in Table 5) helps put this cost in perspective.
The 67 µsec overhead is quite significant when compared to the execution
times of simple calls such as getpid() or gettimeofday(), which take 25 µsec and
47 µsec, respectively, without an agent. It becomes less so when compared to
read() or stat(), which take 370 µsec and 892 µsec to execute in the cases mea-
sured without an agent.
Hence, the impact will always be significant on small calls that do very little
work; it can at least potentially be insignificant for calls that do real work.
In practice, of course, the overheads of actual interposition agents are higher
than the 67 µsec theoretical minimum. The actual overheads for most system calls
implemented using the symbolic system call toolkit level (see Section 2.3) range
from about 140 to 210 µsec, as per Table 5. Overheads for fork() and execve()
are significantly greater, adding approximately 10 milliseconds to both, roughly
doubling their costs.
The execve() call is more expensive than most because it must be completely
reimplemented by the toolkit from lower-level primitives, unlike most calls where
the version provided by the underlying implementation can be used. The under-
lying implementation's execve() call can not be used because it clears its caller's
address space. While the application must be reloaded, the agent needs to be
preserved. Thus, the extra expense of execve() is due to having to individually
perform such operations as clearing the caller's address space, closing a subset of
the descriptors, resetting signal handlers, reading the program file, loading the
executable image into the address space, loading the arguments onto the stack,
setting the registers, and transferring control into the loaded image, all of which
are normally done by a single execve() call. Likewise, fork() and _exit()
are more expensive due to the number of additional bookkeeping operations
required.
While the current overheads certainly leave room for optimization (starting
with compiling the agents with optimization on), they are already low enough
to be unimportant for many applications and agents, as discussed in Section 3.4.
Portability
The interposition toolkit should port to similar systems such as SunOS and
UNIX System V. Despite toolkit dependencies on such Mach facilities as the
particular system call interception mechanism used, all such dependencies were
carefully encapsulated within the lowest (boilerplate) layers of the toolkit. None
of the toolkit layers above the boilerplate directly depends on Mach-specific ser-
vices. Higher toolkit layers, while being intentionally 4.3BSD specific, contain
no such dependencies. This 4.3BSD dependency imposes at most minor porta-
bility concerns to other UNIX-derived systems, given their common lineage and
resulting substantial similarity. Thus, it should be possible to port the toolkit by
replacing the Mach-dependent portions of the boilerplate layers with equivalent
services provided by the target environment.
Likewise, interposition agents written for the toolkit should also readily port.
Even if there are differences between the system interfaces, the toolkit port
should be able to shield the agents from these differences, unless, of course, the
agents are directly using the facilities which differ.
One caveat, however, is probably in order. While a port of the toolkit could
shield interposition agents from low-level system interface differences, it certainly
can not shield them from system performance differences. If the
toolkit is ported to systems that provide significantly slower system call in-
terception mechanisms (as, for instance, mechanisms based on UNIX signals are
likely to be), then some agents which previously exhibited acceptable slowdown
might exhibit unacceptable slowdown when ported.
In summary, the interposition agent was more logically structured, was prob-
ably simpler to write and modify, and required no system modifications to im-
plement or run. The kernel-based tracing tools were more efficient.
4 Related Work
This section presents a brief survey of past work providing the ability to inter-
pose user code at the system interface or to otherwise extend the functionality
available through the system interface. This topic does not appear to be well
described in the literature; despite intensive research into past systems I have
been unable to find a comprehensive treatment of the subject.
In particular, no general techniques for building or structuring system inter-
face interposition agents appear to have been in use, and so none are described.
Even though a number of systems provided mechanisms by which interposition
agents could be built, the agents that were built appear to have shared little
or no common ground. No widely applicable techniques appear to have been
developed; no literature appears to have been published describing those ad hoc
techniques that were used.
Thus, the following treatment is necessarily somewhat anecdotal in nature,
with some past interposition agents and other system extensions described only
by personal communications. Nonetheless, this section attempts to provide a
representative, if not comprehensive, overview of the related work.
5 Conclusions
This research has demonstrated that the system interface can be added to the set
of extensible operating system interfaces that can be extended through interpo-
sition. Just as interposition is successfully used today with such communication-
based facilities as pipes, sockets, and inter-process communication channels, this
work has demonstrated that interposition can be successfully applied to the sys-
tem interface. This work extends the known benefits of interposition to a new
domain.
It achieves this result through the use of an interposition toolkit that substan-
tially increases the ease of interposing user code between clients and instances
of the system interface. It does so by allowing such code to be written in terms
of the high-level objects provided by this interface, rather than in terms of the
intercepted system calls themselves.
The following achievements demonstrate this result:
5.2 Contribution
This research has demonstrated both the feasibility and the appropriateness of
extending the system interface via interposition. It has shown that while the
4.3BSD system interface is large, it actually contains a small number of abstrac-
tions whose behavior is largely independent. Furthermore, it has demonstrated
that an interposition toolkit can exploit this property of the system interface. In-
terposition agents can both achieve acceptable performance and gain substantial
implementation leverage through use of an interposition toolkit.
These results should be applicable beyond the initial scope of this research.
The interposition toolkit should port to similar systems such as SunOS and
UNIX System V. Agents written for the toolkit should also port. The lessons
learned in building this interposition toolkit should be applicable to building
similar toolkits for dissimilar systems, as explored in [21]. For instance, interpo-
sition toolkits could be constructed for such interfaces as the MS-DOS system
interface, the Macintosh system interface, and the X Window System interface.
Today, agents are regularly written to be interposed on simple communication-
based interfaces such as pipes and sockets. Similarly, the toolkit makes it possible
to easily write agents to be interposed on the system interface. Indeed, it is an-
ticipated that the existence of this toolkit will encourage the writing of such
agents, many of which would not otherwise have been attempted.
Acknowledgments
I'd like to extend special thanks to Rick Rashid, Eric Cooper, M. Satyanarayanan
(Satya), Doug Tygar, Garret Swart, Brian Bershad, and Patricia Jones. Each of
you has offered helpful suggestions and criticisms that have helped shape
and refine this research.
I'd also like to thank Mark Weiser for his valuable comments on ways to
improve this paper.
References
1. M. Accetta, R. Baron, D. Golub, R. Rashid, A. Tevanian, and M. Young. Mach:
A new kernel foundation for UNIX development. In Proc. Summer 1986 USENIX
Technical Conference and Exhibition, June 1986.
2. Apple Computer, Inc. Macintosh System Software User's Guide Version 6.0, 1988.
3. AT&T, Customer Information Center, P.O. Box 19901, Indianapolis, IN 46219.
System V Interface Definition, Issue 2, 1986.
4. AT&T. Unix System V Release 4-0 Programmer's Reference Manual, 1989.
5. Robert V. Baron, David Black, William Bolosky, Jonathan Chew, Richard P.
Draves, David B. Golub, Richard F. Rashid, Avadis Tevanian, Jr., and
Michael Wayne Young. Mach Kernel Interface Manual. Carnegie Mellon Uni-
versity School of Computer Science, August 1990.
6. Brian N. Bershad and C. Brian Pinkerton. Watchdogs: Extending the UNIX filesys-
tem. In Winter Usenix Conference Proceedings, Dallas, 1988.
7. D. G. Bobrow, J. D. Burchfiel, D. L. Murphy, and R. S. Tomlinson. TENEX,
a paged time sharing system for the PDP-10. Communications of the ACM,
15(3):135-143, March 1972.
8. D. R. Brownbridge, L. F. Marshall, and B. Randell. The Newcastle Connection,
or UNIXes of the world unite! Software - Practice and Experience, 12:1147-1162,
1982.
9. David R. Cheriton. The V distributed system. Communications of the ACM,
31(3):314-333, March 1988.
10. F. W. Clegg, G. S.-F. Ho, S. R. Kusmar, and J. R. Sontag. The HP-UX oper-
ating system on HP Precision Architecture computers. Hewlett-Packard Journal,
37(12):4-22, December 1986.
11. David S. H. Rosenthal. Evolving the vnode interface. In USENIX Conference
Proceedings, pages 107-118. USENIX, June 1990.
12. Digital Equipment Corporation. DECSYSTEM-20 Monitor Calls Reference Man-
ual, January 1978.
13. Digital Equipment Corporation. ULTRIX Reference Pages, Section 2 System Calls,
1989.
14. D. Eastlake, R. Greenblatt, J. Holloway, T. Knight, and S. Nelson. ITS 1.5 reference
manual. Memorandum no. 161, M.I.T. Artificial Intelligence Laboratory, July 1969.
Revised form of ITS 1.4 Reference Manual, June 1968.
15. S. I. Feldman. Make - a program for maintaining computer programs. Software -
Practice and Experience, 9(4):255-265, 1979.
16. David Golub, Randall Dean, Alessandro Forin, and Richard Rashid. Unix as an
application program. In Summer Usenix Conference Proceedings, Anaheim, June
1990.
17. Richard G. Guy, John S. Heidemann, Wai Mak, Thomas W. Page, Jr., Gerald J.
Popek, and Dieter Rothmeier. Implementation of the Ficus replicated file system.
In USENIX Conference Proceedings, pages 63-71. USENIX, June 1990.
18. John S. Heidemann. Stackable layers: an architecture for file system development.
Master's thesis, University of California, Los Angeles, July 1991. Available as
UCLA technical report CSD-910056.
19. J. H. Howard, M. L. Kazar, S. G. Menees, D. A. Nichols, M. Satyanarayanan, R. N.
Sidebotham, and M. J. West. Scale and performance in a distributed file system.
ACM Transactions on Computer Systems, 6(1), February 1988.
42. J. E. Stoy and C. Strachey. OS6 - an experimental operating system for a small
computer. Part 2: Input/output and filing system. Computer Journal, 15(3):195-
203, August 1972.
43. Howard Ewing Sturgis. A postmortem for a time sharing system. Xerox Research
Report CSL-74-1, Xerox Palo Alto Research Center, January 1974.
44. Sun Microsystems, Inc. SunOS Reference Manual, May 1988. Part No. 800-1751-
10.
45. Symantec Corporation. The Norton Antivirus, 1991.
46. Symantec Corporation. Symantec Antivirus for Macintosh, 1991.
47. Robert H. Thomas. A resource sharing executive for the ARPANET. In Proceedings
of the AFIPS National Computer Conference, volume 42, pages 155-163, June
1973.
48. Robert H. Thomas. JSYS traps - a Tenex mechanism for encapsulation of user
processes. In Proceedings of the AFIPS National Computer Conference, volume 44,
pages 351-360, 1975.
49. Trend Micro Devices, Incorporated. PC-cillin Virus Immune System User's Man-
ual, 1990.
50. D. Walsh, B. Lyon, G. Sager, J. M. Chang, D. Goldberg, S. Kleiman, T. Lyon,
R. Sandberg, and P. Weiss. Overview of the sun network filesystem. In Winter
Usenix Conference Proceedings, Dallas, 1985.
On systems providing facilities that securely redirect system calls to distinct
processes, such as those provided by Solaris, System V.4, and some versions of
Mach 3.0, agents can make and enforce security guarantees.
I believe the particular interception technique to be largely independent of the
bulk of the techniques used in the Interposition Agents toolkit.
It is my hope that readers reviewing the Interposition Agents work will have
a broadened understanding of the possibilities for easily building a multiplic-
ity of coexisting custom execution environments for application code. With the
rise of applets, controls, scripts and the other forms of executable content, the
poignancy of the need for such environments seems ever more evident.
1 Introduction
The notion of moving code across the network to the most appropriate host for
execution has become commonplace. Most often code is moved for efficiency, but
sometimes it is for privacy, for fault-tolerance, or simply for convenience. The
major concern when moving code is security: the integrity of the host to which
it is moved is at risk, as well as the integrity of the computation performed by
the moved code itself.
A number of techniques have been used to place protection boundaries be-
tween so-called "untrusted code" moved to a host and the remainder of the
software running on that host. Traditional operating systems use virtual mem-
ory to enforce protection between processes. A process cannot directly read and
write other processes' memory, and communication between processes requires
traps to the kernel. By limiting the traps an untrusted process can invoke, it
can be isolated to varying degrees from other processes on the host. However,
there's little point in sending a computation to a host if it cannot interact with
other computations there; inter-process communication must be possible.
The major problems encountered when using traditional operating system
facilities to isolate untrusted code are deciding whether a specific kernel trap is
permissible or not and overcoming the cost of inter-process communication. The
semantic level of kernel traps does not generally match the level at which protec-
tion policies are specified when hosting untrusted code. In addition, the objects
on which the traps act are the ones managed by the kernel and not the ones
370
2 Language-Based Protection Background
class A {
    private int i;
    public int j;
    public static void method1() {
        A a1 = new A();
        A a2 = new A();
        B.method2(a1);
    }
}

class B {
    public static void method2(A arg) {
        arg.j++;
    }
}
Selective Class Sharing Domains can also protect themselves through control
of their class namespace. To understand this, we need to look at Java's class
loading mechanisms. To allow dynamic code loading, Java supports user-defined
class loaders which load new classes into the virtual machine at run-time. A class
loader fetches Java bytecode from some location, such as a file system or a URL,
and submits the bytecode to the virtual machine. The virtual machine performs
a verification check to make sure that the bytecode is legal, and then integrates
the new class into the machine execution. If the bytecode contains references to
other classes, the class loader is invoked recursively in order to load those classes
as well.
Class loaders can enforce protection by making some classes visible to a
domain, while hiding others. For instance, the example above assumed that
classes A and B were visible to each other. However, if class A were hidden from
class B (i.e. it did not appear in B's class namespace), then even if B obtains a
reference to an object of type A, it will not be able to access the fields i and j,
despite the fact that j is public.
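The namespace control described above can be sketched with a user-defined class loader. This is an illustrative sketch, not the J-Kernel's actual loader; the class name and the hidden-set policy are assumptions for the example.

```java
import java.util.Set;

// Sketch: a class loader that hides selected classes from the domain
// it serves. A name on the hidden list simply does not exist in this
// domain's namespace; everything else is resolved normally.
class FilteringClassLoader extends ClassLoader {
    private final Set<String> hidden;

    FilteringClassLoader(Set<String> hidden) {
        this.hidden = hidden;
    }

    @Override
    protected Class<?> loadClass(String name, boolean resolve)
            throws ClassNotFoundException {
        if (hidden.contains(name)) {
            // From this domain's point of view, the class does not exist.
            throw new ClassNotFoundException(name);
        }
        return super.loadClass(name, resolve);
    }
}
```

A domain loaded through such a loader can be handed references to objects of a hidden class, but cannot name the class and therefore cannot touch its fields, public or not.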
The simple controls over the namespace provided in Java can be used to construct
software components that communicate with each other but are still protected
from one another. In essence, each component is launched in its own namespace,
and can then share any class and any object with other components using the
mechanisms described above. While we will continue to use the term protection
domain informally to refer to these protected components, we will argue that it
is impossible to precisely define protection domains when using this approach.
The example below shows a hypothetical file system component that gives
objects of type FileSystemInterface to its clients to give them access to files.
Client domains make cross-domain invocations on the file system by invoking the
open method of a FileSystemInterface object. By specifying different values
for accessRights and rootDirectory in different objects, the file system can
enforce different protection policies for different clients. Static access control
ensures that clients cannot modify the accessRights and rootDirectory fields
directly, and one client cannot forge a reference to another client's
FileSystemInterface object.
class FileSystemInterface {
    private int accessRights;
    private Directory rootDirectory;
    public File open(String fileName) { /* ... */ }
}
When we first began to explore protection in Java, this share anything ap-
proach seemed the natural basis for a protection system, and we began develop-
ing on this foundation. However, as we worked with this approach a number of
problems became apparent.
class A {
    public int meth1(int a1, int a2) { /* ... */ }
}

class AWrapper {
    private A a;
    private boolean revoked;
    public int meth1(int a1, int a2) {
        if (!revoked) return a.meth1(a1, a2);
        else throw new RevokedException();
    }
    public void revoke() { revoked = true; }
    public AWrapper(A realA) {
        a = realA; revoked = false;
    }
}
In principle, this solves the revocation problem and is efficient enough for
most purposes. However, our experience shows that programmers often forget to
wrap an object when passing it to another domain. In particular, while it is easy
to remember to wrap objects passed as arguments, it is common to forget to wrap
other objects to which the first one points. In effect, the default programming
model ends up being an unsafe model where objects cannot be revoked. This
is the opposite of the desired model: safe by default and unsafe only in special
cases.
provides no help as it makes no distinction between the two. Yet, the distinction
is critical for reasoning about the behavior of a program running in a domain.
Mutable shared objects can be modified at any time by other domains that
have access to the object, and a programmer needs to be aware of this possible
activity. For example, a malicious user might try to pass a byte array holding
legal bytecode to a class loader (byte arrays, like other objects, are passed by
reference to method invocations), wait for the class loader to verify that the
bytecode is legal, and then overwrite the legal bytecode with illegal bytecode
which would subsequently be executed. The only way the class loader can protect
itself from such an attack is to make its own private copy of the bytecode, which
is not shared with the user and is therefore safe from malicious modification.
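The defensive copy described above can be sketched as follows. The class and method names here are invented for illustration; the point is only that the snapshot is taken before verification, so the caller's later writes cannot affect the bytes that were checked.

```java
import java.util.Arrays;

// Sketch: a receiver of caller-supplied bytecode that defends against
// the time-of-check/time-of-use attack by taking a private snapshot
// before any verification step.
class BytecodeReceiver {
    private byte[] code;

    void receive(byte[] untrusted) {
        // Private copy: the caller keeps a reference to 'untrusted' but
        // has no reference to 'code', so it cannot mutate what we verify.
        this.code = Arrays.copyOf(untrusted, untrusted.length);
    }

    byte[] verifiedCopy() {
        return code;
    }
}
```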
Threads By simply using method invocation for cross-domain calls, the caller
and callee both execute in the same thread, which creates several potential haz-
ards. First, the caller must block until the callee returns—there is no way for the
caller to gracefully back out of the call without disrupting the callee's execution.
Second, Java threads support methods such as stop, suspend, and setPriority
that modify the state of a thread. A malicious domain could call another do-
main and then suspend the thread so that the callee's execution gets blocked,
perhaps while holding a critical lock or other resource. Conversely, a malicious
callee could hold on to a Thread object and modify the state of the thread after
execution returns to the caller.
3 The J-Kernel
The J-Kernel is a capability-based system that supports multiple, cooperating
protection domains, called tasks, which run inside a single Java virtual machine.
Capabilities were chosen because they have several advantages over access control
lists: (i) they can be implemented naturally in a safe language, (ii) they can
enforce the principle of least privilege more easily, and (iii) by avoiding access
list lookups, operations on capabilities can execute quickly.
The primary goals of the J-Kernel are:
tures without providing their implementation. Other classes that provide cor-
responding implementations can then be declared to implement the interface.
Normally interface classes are used to provide a limited form of multiple in-
heritance (properly called interface inheritance) in that a class can implement
multiple interfaces. In addition, Sun's remote method invocation (RMI) speci-
fication [19] "pioneered" the use of interfaces as compiler annotations. Instead
of using a separate interface definition language (IDL), the RMI specification
simply uses interface classes that are flagged to the RMI system in that they
extend the class Remote. Extending Remote has no effect other than directing
the RMI system to generate appropriate stubs and marshalling code.
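The idiom looks like this in code; the ReadFile interface and its readByte method are illustrative (they match the capability example below), not part of the RMI specification itself.

```java
import java.rmi.Remote;
import java.rmi.RemoteException;

// Extending Remote is the entire "annotation": it directs the RMI
// system to generate stubs and marshalling code for this interface.
// No separate IDL file is needed.
interface ReadFile extends Remote {
    byte readByte() throws RemoteException;
}
```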
// Task 1:
// instantiate new ReadFileImpl object
ReadFileImpl target = new ReadFileImpl();
// create a capability for the new object
Capability c = Capability.create(target);
// add it to repository under some name
Task.getRepository().bind("Task1ReadFile", c);

// Task 2:
// extract capability
Capability c = Task.getRepository().lookup("Task1ReadFile");
// cast it to ReadFile, and invoke remote method
byte b = ((ReadFile) c).readByte();
other tasks hold a reference to the capability. This prevents tasks from holding
on to garbage in other tasks.
In order to protect the caller's and callee's threads from each other, the
generated stubs provide the illusion of switching threads. Because most virtual
machines map Java threads directly onto kernel threads it is not practical to
actually switch threads: as shown in the next subsection this would slow down
cross-task calls substantially. A fast user-level threads package might solve this
problem, but would require modifications to the virtual machine, and would thus
limit the J-Kernel's portability. The compromise struck in the current implemen-
tation uses a single Java thread for both the caller and callee but prevents direct
access to that thread to avoid security problems.
Conceptually, the J-Kernel divides each Java thread into multiple segments,
one for each side of a cross-task call. The J-Kernel class loader then hides the
system Thread class that manipulates Java threads, and interposes its own with
an identical interface but an implementation that only acts on the local thread
segment. Thread modification methods such as stop and suspend act on thread
segments rather than Java threads, which prevents the caller from modifying
the callee's thread segment and vice-versa. This provides the illusion of thread-
switching cross-task calls, without the overhead for actually switching threads.
The illusion is not totally convincing, however—cross-task calls really do block,
so there is no way for the caller to gracefully back out of one if the callee doesn't
return.
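The thread-segment idea can be sketched as follows. This is a minimal illustration under assumed names (the J-Kernel's interposed class actually mirrors the full java.lang.Thread interface): each side of a cross-task call holds its own segment object, so state changes apply only locally.

```java
// Sketch: a per-call "thread segment". The caller's and callee's
// segments of the same underlying VM thread are distinct objects,
// so suspend() on one side cannot block the other side.
class ThreadSegment {
    private volatile boolean suspended = false;

    public void suspend() { suspended = true; }   // acts on this segment only
    public void resume()  { suspended = false; }
    public boolean isSuspended() { return suspended; }
}
```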
^ Shared classes (and, transitively, the classes that shared classes refer to) are not
allowed to have static fields, to prevent sharing of non-capability objects through
static fields. In addition, to ensure consistency between domains, two domains that
share a class must also share other classes referenced by that class.
Ironically, the J-Kernel needs to prevent the sharing of some system classes.
For example, the file system and thread classes present security problems. Others
contain resources that need to be defined on a per-task basis: the class System,
for example, contains static fields holding the standard input/output streams.
In other words, the "one size fits all" approach to class sharing in most Java
security models is simply not adequate, and a more flexible model is essential to
make the J-Kernel safe, extensible, and fast.
In general, the J-Kernel tries to minimize the number of system classes visible
to tasks. Classes that would normally be loaded as system classes (such as classes
containing native code) are usually loaded into a privileged task in the J-Kernel,
and are accessed through cross-task communication, rather than through direct
calls to system classes. For instance, we have developed a task for file system
access that is called using cross-task communication. To keep compatibility with
the standard Java file API, we have also written alternate versions of Java's
standard file classes, which are just stubs that make the necessary cross-task
calls. (This is similar to the interposition proposed by [35]).
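The stub arrangement might be sketched like this; FileTask, StubFile, and their methods are hypothetical names invented for illustration, not the J-Kernel's actual file API.

```java
// Sketch: the privileged file task exports a capability interface,
// and a stub class preserves a conventional-looking file API while
// routing the real work through a cross-task call.
interface FileTask {
    int readByte(String path, int offset);
}

class StubFile {
    private final FileTask task;  // capability, not a direct system class
    private final String path;

    StubFile(FileTask task, String path) {
        this.task = task;
        this.path = path;
    }

    // Looks like an ordinary file read; actually a cross-task call.
    int read(int offset) {
        return task.readByte(path, offset);
    }
}
```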
The J-Kernel moves functionality out of the system classes and into tasks
for the same reasons that micro-kernels move functionality out of the operating
system kernel. It makes the system as a whole extensible, i.e., it is easy for any
task to provide alternate implementations of most classes that would normally
be system classes (such as file, network, and thread classes). It also means that
each such service can implement its own security policy. In general, it leads to a
cleaner overall system structure, by enforcing a clear separation between differ-
ent modules. Java libraries installed as system classes often have undocumented
and unpredictable dependencies on one another ^. Richard Rashid warned that
the UNIX kernel had "become a 'dumping ground' for every new feature or fa-
cility" [30]; it seems that the Java system classes are becoming a similar dumping
ground.
Null LRMI Table 1 dissects the cost of a null cross-task call (null LRMI)
and compares it to the cost of a regular method invocation, which takes a few
tens of nanoseconds. The J-Kernel null LRMI takes 60x to 180x longer than
^ For instance, Microsoft's implementation of java.io.File depends on
java.io.DataInputStream, which depends on com.ms.lang.SystemX, which
depends on classes in the abstract windowing toolkit. Similarly, java.lang.Object
depends transitively on almost every standard library class in the system.
Threads Table 3 shows the cost of switching back and forth between two Java
threads in MS-VM and Sun-VM. The base cost of two context switches between
NT kernel threads (NT-base) is 8.6µs, and Java introduces an additional 1-2µs
of overhead. This confirms that switching Java threads during cross-task calls
would add a significant cost to J-Kernel LRMI.
mechanism. By making direct copies of the objects and their fields without
using an intermediate Java byte-array, the fast-copy mechanism improves the
performance of LRMI substantially—more than an order of magnitude for large
arguments. The performance difference between the second and third rows (both
copy the same number of bytes) is due to the cost of object allocation and
invocations of the copying routine for every object.
Our J-Server setup uses two off-the-shelf products: Lucent Technologies DEFIN-
ITY Enterprise Communications Server and the Dialogic Dialog/4 voice inter-
face. At the core, the DEFINITY server is a Private Branch Exchange (PBX),
i.e. a telephone switch, augmented by an Ethernet-based network controller. The
network controller in this PBX allows the routing of telephone calls to be moni-
tored and controlled from a PC over the LAN. Lucent provides a Java interface
allowing the J-Server to communicate with the PBX using the Java Telephony
API (JTAPI) [18].
The Dialog/4 Voice Processing Board, from Dialogic Corporation, provides
four half-duplex, analog telephone interfaces and includes a digital signal proces-
sor (DSP) to play and record audio, and to detect and transmit DTMF (dual-
tone multi-frequency) signals. Using two Dialog/4 boards, the J-Server is able to
make eight simultaneous half-duplex voice connections to the telephone network.
Each servlet in the J-Server runs as a separate task to isolate it from the J-
Server core as well as from other servlets. This setup allows new servlets to be
introduced into the system or crashed ones to be reloaded without disturbing the
server operation. The J-Server core is also divided into several tasks. One task
dispatches incoming HTTP requests to servlets. Three tasks handle telephony:
the first is responsible for communicating with the PBX via JTAPI, the second
deals with the Dialogic voice processing boards, and the third is in charge of
dispatching telephony events to the appropriate servlets.
During operation, the J-Server performs a large number of cross-task calls to
pass events and data around. This is best illustrated by looking at the handling
of a telephone call by a simple voice-mail servlet. When a telephone call reaches
the PBX, an event is sent to the J-Server via JTAPI and is handled by the
task in charge of PBX events. This task then sends the event information to the
telephony dispatch task, which in turn passes the information to the appropriate
servlet. The servlet instructs the J-Server to route the call to one of the telephone
lines connected to the Dialogic card. At this point, the voice processing task
begins generating events, starting with a "ringing" event. These events are passed
to the dispatching task, and then on to the appropriate servlet. The servlet
proceeds to take the telephone off hook, and to start playing audio data. In
this application, the data is passed from the servlet task to the dispatch task
and then on to the voice task in one large chunk. When the voice channel has
finished playing the audio sample, it alerts the servlet, which, in turn, responds
by creating an appropriate response object. When the calling party hangs up,
the servlet instructs the voice channel to stop recording. The servlet waits for
all recorded data and then saves the sample as a Microsoft WAVE file for later
playback.
4.2 Performance
The second test, VoiceBox Listing, measures the performance of the HTTP
component of J-Server. Using a web browser, a user makes a request to the
servlet, which displays the contents of a voice box directory, formatted as an
HTML page. Cross-task calls are made from the HTTP dispatching task to the
servlet task. In addition, a simple authentication servlet is used to verify the
user's password, resulting in another cross-task call.
Table 5 summarizes the results. The Initiator and Responder columns cor-
respond to two parties involved in HalfDuplex Conversation, while the Listing
column corresponds to the servlet fetching and formatting information about
voice messages. The first five rows present the actual measurement values; the
remaining rows contain derived information.
In all cases the overheads of crossing tasks (which include the cost of copying
arguments and then return values) are below 1% of the total consumed CPU
time. On the average, crossing tasks costs between 3µs-4.6µs and an average
call transfers between 312 (Initiator) to 450 (Listing) bytes. The cost of copying
data accounts for 27%-31% of an average cross-task call. This suggests that
overheads of crossing tasks in real applications when no data transfer is involved
are between 2.1µs-3.3µs. This is in contrast to the cost of a null cross-task call,
measured in an isolated tight loop: 1.35µs.
The achieved bandwidth of copying data across tasks is sufficient for this ap-
plication. About 85% of the data transferred across tasks in the two experiments
is stored in byte arrays of 0.5Kbytes and 1Kbytes; the remaining data are either
references to capabilities, primitive data types or data inside objects of the type
java.lang.String. The average copying bandwidth is 330-380Mbytes/s, which
is over 50% of the peak data copying bandwidth of 630Mbytes/s achieved in
Java with a tight loop of calls to System.arraycopy().
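The peak-bandwidth figure comes from a tight loop of this kind. The sketch below is illustrative: the class name, buffer size, and iteration count are assumptions, not the paper's actual benchmark harness.

```java
// Sketch: measure achieved copy bandwidth (Mbytes/s) with a tight
// loop of System.arraycopy calls between two preallocated buffers.
class CopyBandwidth {
    static double measureMBps(int bufSize, int iters) {
        byte[] src = new byte[bufSize];
        byte[] dst = new byte[bufSize];
        long t0 = System.nanoTime();
        for (int i = 0; i < iters; i++) {
            System.arraycopy(src, 0, dst, 0, bufSize);
        }
        double seconds = (System.nanoTime() - t0) / 1e9;
        return (double) bufSize * iters / seconds / 1e6;
    }
}
```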
With respect to scalability, our current hardware configuration supports at
most eight simultaneous half-duplex connections. Every connection results in an
additional 1.7%-2.3% increase in the CPU load. From this we estimate that,
given enough physical lines, our system could support about 50 simultaneous
half-duplex connections before reaching full CPU utilization. Since the J-Server
demands modest amounts of physical memory, permanent storage and network
resources, CPU is the bottleneck from the perspective of scaling the system,
so the range of 50 simultaneous half-duplex connections is very likely to be
achievable in practice.
While analyzing the performance, it is important to note that Java introduces
non-obvious overheads. For instance, in the Listing experiment, a large part of
the 18.5ms of CPU time can be accounted for as follows. About seven millisec-
onds are spent in the network and file system Java classes and native protocol
stacks. Roughly eight milliseconds are spent performing string concatenations
in one particular part of the VoiceBox Listing servlet (Java string processing is
very slow under MS JVM). These overheads, which are introduced by Java and
not easy to avoid, dwarf the relative costs of crossing task boundaries and lead to
conclusions that may not hold if JVM performance improves dramatically.
One conclusion, based on the current performance numbers, is that in large
applications similar to J-Server, the task boundaries crossing overheads are
very small relative to the total execution time and further optimizing them
will bring barely noticeable performance improvements. Second, in applications
where crossing tasks is frequent and optimizing cross-task calls actually matters,
on the average 30% of task crossing overheads can be removed by wrapping data
structures and arrays in capabilities. This avoids copying data between tasks but
comes at the expense of increased programming effort. It is important to stress
that the conclusions are drawn from observing several experiments based on a
single, although complex and realistic, application. More study is needed to fully
understand application behavior on top of Java in general and the J-Kernel in
particular.
5 Related Work
The Alta and GVM systems [1] developed at the University of Utah are closely
related to the J-Kernel. The motivation is to implement a process model in
Java, which provides memory protection, control over scheduling, and resource
management. Both systems are implemented by modifying a JVM (the free
Kaffe JVM is used). In GVM each Java process receives separate threads, heap,
and classes. Sharing is only allowed between a process and a special system
heap. This invariant is enforced using a write barrier. Each heap is garbage
collected separately and a simple reference counting mechanism is used for cross-
heap references. Alta implements the nested process model used in the Fluke
microkernel [10]. Child processes of the same parent process can share data
structures directly: the resources used are attributed to the parent. While the
inter-process communication costs of Alta and GVM are higher than in the J-
Kernel (see [1]), the custom JVM modifications allow both GVM and Alta to
take control of thread scheduling and to be more precise in resource management.
Several major vendors have proposed extensions to the basic Java sandbox se-
curity model for applets [20, 25, 29]. For instance, Sun's JDK 1.1 added a notion
of authentication, based on code signing, while the JDK 1.2 adds a richer struc-
ture for authorization, including classes that represent permissions and methods
that perform access control checks based on stack introspection [12]. JDK 1.2
"protection domains" are implicitly created based on the origin of the code, and
on its signature. This definition of a protection domain is closer to a user in
Unix, while the J-Kernel's task is more like a process in Unix. Balfanz et al. [2]
define an extension to the JDK which associates domains with users running
particular code, so that a domain becomes more like a process. However, if do-
mains are able to share objects directly, revocation, resource management, and
domain termination still need to be addressed in the JDK.
JDK 1.2 system classes are still lumped into a monolithic "system domain",
but a new classpath facilitates loading local applications with class loaders rather
than as system classes. However, only system classes may be shared between
domains that have different class loaders, which limits the expressiveness of
communication between domains. In contrast, the J-Kernel allows tasks to share
classes without requiring these tasks to use the same class loader. In the future
work section, Gong et al. [12] mention separating current system classes (such as
file classes) into separate tasks, in accordance with the principle of least privilege.
The J-Kernel already moves facilities for files and networking out of the system
classes and into separate tasks.
A number of related safe-language systems are based on the idea of using
object references as capabilities. Wallach et al. [35] describe three models of
Java security: type hiding (making use of dynamic class loading to control a
domain's namespace), stack introspection, and capabilities. They recommended
a mix of these three techniques. The E language from Electric Communities [8]
is an extension of Java targeted towards distributed systems. E's security archi-
tecture is capability based; programmers are encouraged to use object references
as the fundamental building block for protection. Odyssey [11] is a system that
supports mobile agents written in Java; agents may share Java objects directly.
Hagimont et al. [14] describe a system to support capabilities defined with spe-
cial IDL files. All three of these systems allow non-capability objects to be passed
directly between domains, and generally correspond to the share anything ap-
proach described in Section 2. They do not address the issues of revocation,
domain termination, thread protection, or resource accounting.
The SPIN project [3] allows safe Modula-3 code to be downloaded into the
operating system kernel to extend the kernel's functionality. SPIN has a par-
ticularly nice model of dynamic linking to control the namespace of different
extensions. Since it uses Modula-3 pointers directly as capabilities, the limita-
tions of the share anything approach apply to it.
Several recent software-based protection techniques do not rely on a partic-
ular high level language like Java or Modula-3. Typed assembly language [26]
pushes type safety down to the assembly language level, so that code written at
the assembly language level can be statically type checked and verified as safe.
Software fault isolation [34] inserts run-time "sandboxing" checks into binary
executables to restrict the range of memory that is accessible to the code. With
suitable optimizations, sandboxed code can run nearly as fast as the original
binary on RISC architectures. However, it is not clear how to extend optimized
sandboxing techniques to CISC architectures, and sandboxing cannot enforce
protection at as fine a granularity as a type system. Proof carrying code [27, 28]
generalizes many different approaches to software protection—arbitrary binary
code can be executed as long as it comes with a proof that it is safe. While
this can potentially lead to safety without overhead, generating the proofs for a
language as complex as Java is still a research topic.
The J-Kernel enforces a structure that is similar to traditional capability sys-
tems [22, 23]. Both the J-Kernel and traditional capability systems are founded
on the notion of unforgeable capabilities. In both, capabilities name objects in a
context-independent manner, so that capabilities can be passed from one domain
to another. The main difference is that traditional capability systems used virtual
memory or specialized hardware support to implement capabilities, while the J-
Kernel uses language safety. The use of virtual memory or specialized hardware
led either to slow cross-domain calls, to high hardware costs, or to portability
limitations. Using Java as the basis for the J-Kernel simplifies many of the is-
sues that plagued traditional capability systems. First, unlike systems based on
capability lists, the J-Kernel can store capabilities in data structures, because
capabilities are implemented as Java objects. Second, rights amplification [22]
is implicit in the object-oriented nature of Java: invocations are made on meth-
ods, rather than functions, and methods automatically acquire rights to their
self parameter. In addition, selective class sharing can be used to amplify other
parameters. Although many capability systems did not support revocation, the
idea of using indirection to implement revocation goes back to Redell [31]. The
problems with resource accounting were also on the minds of implementers of
capability systems—Wulf et al. [36] point out that "No one 'owns' an object in
the Hydra scheme of things; thus it's very hard to know to whom the cost of
maintaining it should be charged".
Single-address operating systems, like Opal [6] and Mungi [17], remove the
address space borders, allowing for cheaper and easy sharing of data between
processes. Opal and Mungi were implemented on architectures offering large
address spaces (64-bit) and used password capabilities as the protection mech-
anism. Password capabilities are protected from forgery by a combination of
encryption and sparsity.
Several research operating systems support very fast inter-process commu-
nication. Recent projects, like L4, Exokernel, and Eros, provide fine-tuned im-
plementations of selected IPC mechanisms, yielding an order of magnitude im-
provement over traditional operating systems. The systems are carefully tuned
and aggressively exploit features of the underlying hardware.
The L4 µ-kernel [15] rigorously aims for minimality and is designed from
scratch, unlike first-generation µ-kernels, which evolved from monolithic OS
kernels. The system was successful at dispelling some common misconceptions
about µ-kernel performance limitations. Exokernel [9] shares L4's goal of being
an ultra-fast "minimalist" kernel, but is also concerned with untrusted loadable
modules (similar to the SPIN project). Untrusted code is given efficient control
over hardware resources by separating management from protection. The focus
of the EROS [33] project is to support orthogonal persistence and real-time com-
putations. Despite quite different objectives, all three systems manage to provide
very fast implementations of IPC with comparable performance, as shown in Ta-
ble 6. A short explanation of the 'operation' column is needed. Round-trip IPC
is the time taken for a call transferring one byte from one process to another and
returning to the caller; Exokernel's protected control transfer installs the callee's
processor context and starts execution at a specified location in the callee.
6 Conclusion
The J-Kernel project explores the use of safe language technology to construct
robust protection domains. The advantages of using language-based protection
are portability and good cross-domain performance. The most straightforward
implementation of protection in a safe language environment is to use object ref-
erences directly as capabilities. However, problems of revocation, domain termi-
nation, thread protection, and resource accounting arise when non-shared object
references are not clearly distinguished from shared capabilities. We argue that
a more structured approach is needed to solve these problems: only capabilities
can be shared, and non-capability objects are confined to single domains.
We developed the J-Kernel system, which demonstrates how the issues of
object sharing, class sharing, thread protection, and resource management can
be addressed. As far as we know, the J-Kernel is the first Java-based system
that integrates solutions to these issues into a single, coherent protection system.
Our experience using the J-Kernel to extend the Microsoft IIS web server and
to implement an extensible web and telephony server leads us to believe that a
safe language system can achieve both robustness and high performance.
References
1. G. Back, P. Tullmann, L. Stoller, W. C. Hsieh, J. Lepreau. Java Operating Sys-
tems: Design and Implementation. Technical Report UUCS-98-015, Department of
Computer Science, University of Utah, August, 1998.
2. D. Balfanz and L. Gong. Experience with Secure Multi-Processing in Java. Tech-
nical Report 560-97, Department of Computer Science, Princeton University,
September 1997.
3. B. Bershad, S. Savage, P. Pardyak, E. Sirer, M. Fiuczynski, D. Becker, S. Eggers,
and C. Chambers. Extensibility, Safety and Performance in the SPIN Operating
System. 15th ACM Symposium on Operating Systems Principles, p.267-284, Cop-
per Mountain, CO, December 1995.
4. B. Bershad, T. Anderson, E. Lazowska, and H. Levy. Lightweight Remote Proce-
dure Call. 12th ACM Symposium on Operating Systems Principles, p. 102-113,
Litchfield Park, AZ, December 1989.
5. R. S. Boyer, and Y. Yu. Automated proofs of object code for a widely used micro-
processor. J. ACM 43(1), p. 166-192, January 1996.
6. J. Chase, H. Levy, E. Lazowska, and M. Baker-Harvey. Lightweight Shared Ob-
jects in a 64-Bit Operating System. ACM Object-Oriented Programming Systems,
Languages, and Applications (OOPSLA), October 1992.
7. G. Czajkowski and T. von Eicken. JRes: A Resource Accounting Interface for Java.
To appear in proceedings of the 1998 Conference on Object-Oriented Programming
Languages, Systems, and Applications.
8. Electric Communities. The E White Paper.
http://www.communities.eom/products/tools/e.
9. D. Engler, M. F. Kaashoek, and J. O'Toole Jr. Exokernel: An Operating System
Architecture for Application-Level Resource Management. 15th ACM Symposium
on Operating Systems Principles, p. 251-266, Copper Mountain, CO, December
1995.
10. B. Ford, G. Back, G. Benson, J. Lepreau, A. Lin, and O. Shivers. The Fluke OSKit:
A substrate for OS and language research. In Proc. of the 16th SOSP, pp. 38-51,
St. Malo, France, October 1997.
11. General Magic. Odyssey, http://www.genmagic.com/agents.
12. L. Gong and R. Schemers. Implementing Protection Domains in the Java Devel-
opment Kit 1.2. Internet Society Symposium on Network and Distributed System
Security, San Diego, CA, March 1998.
13. J. Gosling, B. Joy, and G. Steele. The Java language specification. Addison-Wesley,
1996.
14. D. Hagimont, and L. Ismail. A Protection Scheme for Mobile Agents on Java.
3rd Annual ACM/IEEE Int'l Conference on Mobile Computing and Networking,
Budapest, Hungary, September 2630, 1997.
15. H. Haertig, et. al. The Performance of fi-Kernel-Based Systems. 16th ACM Sym-
posium on Operating Systems Principles, p. 6677, Saint-Malo, FYance, October
1997.
16. C. Hawblitzel, C. C. Chang, G. Czajkowski, D. Hu, and T. von Eicken. Imple-
menting Multiple Protection Domains in Java. 1998 USENIX Annual Technical
Conference, p. 259-270, New Orleans, LA, June 1998.
17. G. Heiser, et. al. Implementation and Performance of the Mungi Single-Address-
Space Operating System. Technical Report UNSW-CSE-TR-9704, Univeristy of
New South Wales, Sydney, Austraha, June 1997.
18. JavaSoft. Java Telephony API. http://java.sun.com/products/jtapi/index.html.
19. JavaSoft. Remote Method Invocation Specification, http://java.sun.com.
20. JavaSoft. New Security Model for JDK1.2. http://java.sun.com
21. JavaSoft. Java Servlet API. http://java.sun.com.
22. A. K. Jones and W. A. Wulf Towards the Design of Secure Systems. Software
Practice and Experience, Volume 5, Number 4, p. 321336, 1975.
23. H. M. Levy. Capability-Based Computer Systems. Digital Press, Bedford, Mas-
sachusetts, 1984.
24. J. Liedtke, et. al. Achieved IPC Performance. 6th Workshop on Hot Topics in
Operating Systems, Chatham, MA, May.
25. Microsoft Corporation. Microsoft Security Management Architecture White Paper.
http://www.microsoft.com/ie/ security.
26. G. Morrisett, D. Walker, K. Crary, and N. Glew. Prom System F to Typed Assembly
Language. 25th ACM Symposium on Principles of Programming Languages. San
Diego, CA, January 1998.
27. G. Necula and P. Lee. Safe Kernel Extensions Without Run-Time Checking.
2nd USENIX Symposium on Operating Systems Design and Implementation, p.
229243, Seattle, WA, October 1996.
28. G. Necula. Proof-carrying code. 24th ACM Symposium on Principles of Program-
ming Languages, p. 106119, Paris, 1997.
393
Secure Network Objects
Leendert van Doorn, Martin Abadi, Mike Burrows, and Edward Wobber
1 Introduction
Object-oriented communication has become popular in distributed systems [2,
23,19]. With objects or without them, distributed systems typically rely on
networks with no low-level support for security; the vulnerability of distributed
systems is by now evident and worrisome [24,4]. Therefore, a need exists for
secure object-oriented communication.
We describe the design and implementation of secure network objects. Secure
network objects extend Modula-3 network objects [18,2] with security guaran-
tees. When a client invokes a method of a secure network object over the network,
the main security properties are:
- The client must possess an unforgeable object reference.
- The client and the owner of the object can choose to authenticate each other.
- The arguments and results of the method invocation are protected against
tampering and replay, and optionally against eavesdropping.
For high-speed bulk communication, the network objects system supports buff-
ered streams called readers and writers. We make these streams secure also.
Our design accommodates both access control lists (ACLs) [11] and capa-
bilities [6]. It seems natural to treat network object references as capabilities;
moreover, these capabilities can be implemented efficiently. However, capabili-
ties suffer from the well-known confinement problem: it is hard to keep them
sufficiently secret (cf. [10]). The support for ACLs allows implementors to limit
this problem, and to use identity-based security whenever that is appropriate, in
particular for auditing. Systems with both ACLs and capabilities are not new;
we include some comparisons in section 6.
Based on "Secure Network Objects" by Leendert van Doorn, Martin Abadi, Mike
Burrows, and Edward Wobber, which appeared in the Proceedings of the IEEE
Symposium on Security & Privacy; Oakland, California, May 1996; 211-221. ©1996
IEEE.
The central goal of our work was the integration of security and network
objects. We have obtained the following features:
- Applications can use security easily, with minimal code changes.
* Security is mostly encapsulated within the network objects run-time sys-
tem.
* Objects and methods provide convenient units of protection.
* Subtyping expresses security properties quite simply. Secure network ob-
jects are a subtype of regular network objects.
- Through the combination of ACLs and capabilities, the security model is
rich enough to enable applications with sophisticated security requirements.
- The implementation of our design is reasonably straightforward. In this re-
spect, we benefited from the structure of the existing network objects system.
We also borrowed ideas from previous work on authentication [27]: each node
runs an authentication agent that is responsible for managing keys and for
identifying local users to other nodes. We feel that our experience partially
validates those previous efforts.
The next section presents our programming interface. Sections 3 and 4 de-
scribe the two main components of our system: the authentication agent and the
run-time system. Section 5 discusses experience with secure network objects, in-
cluding performance measurements and some example applications. Section 6
discusses related work.
2 Programming Interface
In a world with fast CPUs, no government export controls, and pervasive use
of cryptographic credentials, we would give uniform security guarantees for all
objects, even for those that do not require them. This would make for a simpler
system and a shorter paper.
As a compromise, we define network objects with three levels of security:
(1) no security; (2) authenticity; (3) authenticity and secrecy. We call the last
two kinds secure network objects. A secure invocation is the invocation of a
method of a secure network object.
An important characteristic of our design is that it gives security guarantees
for whole objects rather than for individual methods. This allows us to specify the
properties of secure network objects through subtyping, rather than by inventing
new language features. We have a type of objects with no security, a subtype
with authenticity, and a further subtype that adds secrecy.
These types are explained in the following sections, along with a type to
represent identities. We postpone the discussion of readers and writers to sec-
tion 2.6.
2.2 Authenticity
1. Integrity: The invocation that the server receives is exactly the one issued
by the client. The results that the client receives are exactly the ones issued
by the server as a response to this invocation.
2. At-most-once semantics: The server receives the invocation at most once.
The client receives the response at most once.
3. Confinement: If the invocation or the response contains a secure network
object, an eavesdropper does not learn enough to invoke a method of that
object.
By (1), the server knows that the client has the method name, the object,
and all additional arguments of the invocation; the client knows that the server
has the results. In addition, since an object has a unique server, that same server
responds to all invocations of methods of the object.
By (2), client and server are protected against replays.
2.3 Secrecy
We introduce a subtype SecNetObj.T of AuthNetObj.T for secret communica-
tion. When a client invokes a method of an object of type SecNetObj.T, the
following additional guarantee holds:
4. Secrecy: An eavesdropper does not obtain any part of the method name, the
object, the additional arguments, or the results of the invocation.
However, we do not attempt to provide perfect anonymity. An eavesdropper may
recognize that two method invocations are for the same object. As explained in
the next section, an eavesdropper may also learn who is communicating.
2.4 Identity
An identity consists of a user name and a host name. For simplicity, we assume
that each address space is running on behalf of one user on one host, and we
associate with each address space the corresponding identity.
We introduce a type to represent identities:
INTERFACE Ident;
TYPE
T <: OBJECT METHODS
userName(): TEXT;
hostName(): TEXT;
END;
PROCEDURE Mine(): T;
END Ident.
INTERFACE FileServer;
TYPE
T = AuthNetObj.T OBJECT METHODS
owner(): Ident.T;
open(id: Ident.T; name: TEXT): File;
END;
END FileServer.
By convention this object provides an owner method which returns the identity
of the principal on whose behalf the file server is running. This result is used by
a client to authenticate the file server.
Although fs is secure, anyone can obtain fs and invoke its methods. There-
fore, it is reasonable for the file server to expect an identity as argument of any
method invocation. When a client c issues the call fs.open(Ident.Mine(),
"/etc/motd"), the open method should check the identity of c. If this check
succeeds, the open method may return another secure network object f, which
represents the open file "/etc/motd". Client c may then invoke f.read(). Be-
cause c and the file server have authenticated one another, c and the owner of
f may choose not to authenticate one another further, even though the owner
of f need not be the file server.
Identity objects have important applications beyond bootstrapping. They
allow a server to confine the use of capabilities to particular clients. Additionally,
identities can be logged to construct an audit trail.
In our example, c could pass its identity to the read method of f, which
could check that c is a particular user, or belongs to a particular group. The
identity check provides protection even if c publishes f. The read method could
also record the names of all callers, for future inspection.
Another important application of the type Ident.T is the implementation
of various authorization mechanisms, and particularly reference monitors with
ACLs. Several designs are possible. For example, a secure network object may
include a method checkACL for making access-control decisions; the arguments
of checkACL are a mode m (such as read or write) and an identity i; the result
is a boolean that indicates whether the user with identity i is allowed access
with mode m to the object. In the intended implementation, checkACL compares
i with names kept in lists (that is, in ACLs). Additional methods enable the
modification of the ACLs. When group-membership checks are needed, checkACL
can consult group registries across the network using secure invocations.
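The checkACL design sketched above can be rendered as follows. This is an illustrative sketch in Java rather than the paper's Modula-3; the class name AclMonitor and the methods grant and revoke are invented for the example:

```java
import java.util.*;

// Hypothetical sketch of a reference monitor with per-mode ACLs, in the
// spirit of the checkACL method described in the text.
public class AclMonitor {
    public enum Mode { READ, WRITE }

    // One list of authorized user names per access mode.
    private final Map<Mode, Set<String>> acls = new EnumMap<>(Mode.class);

    public AclMonitor() {
        for (Mode m : Mode.values()) acls.put(m, new HashSet<>());
    }

    // checkACL: is the user with identity `user` allowed access with mode `m`?
    public boolean checkACL(Mode m, String user) {
        return acls.get(m).contains(user);
    }

    // Additional methods enable modification of the ACLs.
    public void grant(Mode m, String user)  { acls.get(m).add(user); }
    public void revoke(Mode m, String user) { acls.get(m).remove(user); }
}
```

In the paper's design the identity argument would be an Ident.T obtained from a secure invocation; a plain string stands in for it here.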
2.6 Readers and Writers
Modula-3 includes buffered streams, called readers and writers. For example, a
reader might be the stream of data from a file or from a terminal. The network
objects system allows readers and writers to be passed between address spaces,
as follows: an address space creates a reader or writer, and passes it to one other
address space, which reads from it or writes to it, but never passes it. The two
address spaces can then use the reader or writer to transmit data directly on their
underlying network connection, without the overhead of method invocations.
Readers and writers are treated specially when passed in secure invocations:
2.7 An Example
Let us consider a trivial terminal server that offers shells for remote users. Be-
fore a user gets a shell, the server obtains the user's identity, both to verify that
the user is legitimate and to associate the identity with the shell. On the other
hand, the user obtains the server's identity, and can check that the server is
the expected one and not an impostor. Once the user has the shell, the user's
commands and their results are protected against tampering, replay, and eaves-
dropping.
The terminal server exports the following interface:
INTERFACE STS;
TYPE
T = SecNetObj.T OBJECT METHODS
owner(): Ident.T;
create_shell(id: Ident.T; rd: Rd.T; wr: Wr.T);
END;
END STS.
VAR
agent: NetObj.Address;
sts: STS.T;
server_ident: Ident.T;
BEGIN
agent := NetObj.Locate(serverHost);
sts := NetObj.Import("STS", agent);
server_ident := sts.owner();
IF server_ident.userName() = "root"
AND server_ident.hostName() = serverHost
THEN
sts.create_shell(Ident.Mine(),
Stdio.stdin, Stdio.stdout);
ELSE
(* the server is an impostor *)
END;
END Client.
The client program imports an object sts of type STS.T and verifies the iden-
tity server_ident of its owner. If this check succeeds, the client program starts
a shell by calling sts.create_shell. Since STS.T is a subtype of SecNetObj.T,
the shell's standard input and output benefit from the guarantees associated
with SecNetObj.T.
3 Authentication Agents
So far we have focused on the design of a programming interface for security; in
the remainder of the paper we describe our implementation for this programming
interface. Our system has two main components, the authentication agent and
the run-time system. We describe the first in this section and the second in the
next.
In our system, each node runs an authentication agent. An authentication
agent is a process that assists application address spaces for the purposes of
security. It communicates with its local clients only via local secure channels.
In our implementation, the authentication agent is a user-level process; as local
secure channels we have used Unix domain sockets in one version of our im-
plementation and System V streams in another. Each agent is responsible for
managing identities and keys, as follows.
When an address space receives an identity object, it can ask the local agent
for the corresponding user name and host name. The agent answers this question
by communicating with its peers; each agent knows the name of its host and the
user names associated with its clients.
The agent also provides channel keys. A channel key is an encryption key
shared by two address spaces. The agent performs key exchange with its peers
in order to generate these channel keys for its clients.
When the agent negotiates a new channel key, it also negotiates an expiry
time for the key and a key identifier. A key identifier allows the sender of a
message to tell the receiver which key was used for constructing the message.
The key identifier can be transmitted in clear as part of the message, while
the key itself should not be. The agent picks key identifiers so that they are
unambiguous.
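The key-identifier scheme just described can be sketched as follows, in Java for illustration; the names KeyTable, ChannelKey, install, and lookup are invented, and real key negotiation between agents is elided:

```java
import java.util.*;

// Sketch of an agent's channel-key table: each key carries a negotiated
// expiry time and an unambiguous identifier. The identifier may travel in
// clear as part of a message; the key bytes themselves never do.
public class KeyTable {
    public static final class ChannelKey {
        final byte[] secret;   // shared by the two address spaces
        final long expiry;     // expiry time (millis since epoch, say)
        ChannelKey(byte[] secret, long expiry) { this.secret = secret; this.expiry = expiry; }
    }

    private final Map<Long, ChannelKey> byId = new HashMap<>();
    private long nextId = 0;

    // Register a freshly negotiated key; identifiers are picked so that
    // they are unambiguous (here: a simple counter).
    public long install(byte[] secret, long expiry) {
        long id = nextId++;
        byId.put(id, new ChannelKey(secret, expiry));
        return id;
    }

    // Map the key identifier from a message back to the key, rejecting
    // unknown or expired identifiers.
    public ChannelKey lookup(long keyId, long now) {
        ChannelKey k = byId.get(keyId);
        return (k == null || now > k.expiry) ? null : k;
    }
}
```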
Our design encapsulates all of the key exchange machinery in the authenti-
cation agent. We have tried two implementations of key exchange, one based on
our own protocol and another that relies on Sun's secure RPC authentication
service. The change of implementation was transparent to user address spaces.
In the second implementation, we wrote 1400 lines of C code for the agent.
We took the idea of using an authentication agent from the work of Wobber et
al. [27]. The authentication agent described in that work is more elaborate than
ours; for example, it deals with delegation and supports channel multiplexing.
These features could be incorporated in our system, though they may preclude
the use of off-the-shelf authentication software.
4.1 Capabilities
The wire representation of an insecure network object is a pair (s, objid), where:
The tuple (s, objid, capid, key, exp) is a capability. The capability is created by the
owner of the object. The key is the secret that is shared between the holders of the
capability and the owner of the object. The key may never appear in clear on the
network: it is encrypted using a channel key when transmitted between address
spaces in secure invocations. The components s, objid, and capid determine key
uniquely, and thus serve as key identifier.
The capability becomes invalid once its expiration time has passed. The use
of an expiration time can be beneficial in revoking a capability; it also limits
the use of the key. The client run-time system refreshes the capabilities it holds
before they expire. This is transparent to the client application. Each secure
network object has a hidden method that the client run-time system can call to
obtain a fresh capability for the object.
In general, more than one capability may be in use for any given object. The
owner of an object maintains a set of valid capabilities for the object, and is
careful not to give out capabilities that are about to expire.
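The owner-side bookkeeping described above can be sketched as follows; this is an illustrative Java rendering with invented names (CapabilitySet, fresh, minRemainingLife), covering only the capId and exp components of a capability:

```java
import java.util.*;

// Sketch of the owner's set of valid capabilities for one object. The owner
// drops capabilities once they expire and is careful not to give out
// capabilities that are about to expire, minting fresh ones instead.
public class CapabilitySet {
    public static final class Capability {
        final long capId;
        final long expiry;   // the exp component of (s, objid, capid, key, exp)
        Capability(long capId, long expiry) { this.capId = capId; this.expiry = expiry; }
    }

    private final List<Capability> valid = new ArrayList<>();
    private long nextCapId = 0;
    private final long minRemainingLife;  // never hand out caps with less life than this

    public CapabilitySet(long minRemainingLife) { this.minRemainingLife = minRemainingLife; }

    // Called, e.g., when a client's hidden refresh method asks for a fresh
    // capability: reuse a capability with enough remaining lifetime, or mint one.
    public Capability fresh(long now, long lifetime) {
        valid.removeIf(c -> c.expiry <= now);   // expired capabilities are invalid
        for (Capability c : valid)
            if (c.expiry - now >= minRemainingLife) return c;
        Capability c = new Capability(nextCapId++, now + lifetime);
        valid.add(c);
        return c;
    }
}
```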
4.2 Protocol
If a client c holds a network object reference (s, objid) for an object of type
NetObj.T, an invocation of a method of the object consists of a request from c
and a reply from the owner s. The request is (Request: c, s, objid, reqdata), where
reqdata includes the method name and arguments of the invocation. The reply
is (Reply: c, s, repdata), where repdata contains the results of the invocation.
If a client c holds a capability (s, objid, capid, key, exp) for an object of type
AuthNetObj.T, both the request and the reply are modified. We assume that the
local authentication agents for c and s have authenticated one another, and that
they make available a channel key chankey for communication between c and s.
This key will be used for signing the request and the reply, and sometimes for
encryption, as follows.
The request has two parts, a body and a signature. The body, reqbody, is:
where mid is a message identifier that c has never before attached to a request
signed using chankey. The signature is:
may have the same key.) Both a1 and a2 know the secure channel identifier
and keep track of the key and the sequence number associated with the secure
channel. With each method invocation from a1 to a2 that uses a secure channel,
both a1 and a2 increment the corresponding sequence number. When the key for
a secure channel is halfway to its expiration, a1 asks its authentication agent
for a new key; on seeing a new key identifier, a2 obtains the new key from its
own agent.
In the normal case, a secure invocation from a1 to a2 proceeds as follows.
First a1 chooses a secure channel from a1 to a2 on which there is no outstanding
invocation (and sets up a new secure channel if none is available). Then a1
constructs the request using the channel key for the secure channel; mid is the
concatenation of the sequence number and the secure channel identifier. When a2
receives the request, it compares the sequence number in mid against its version
of the sequence number. The reply uses the same key and message identifier as
the request. When a1 receives the reply, it verifies that the key is still current
and checks mid.
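The per-channel sequence-number check can be sketched as follows, in Java for illustration; the class ReplayGuard is invented, and the channel identifier that the paper's mid also carries is omitted:

```java
// Sketch of the sequence-number tracking that gives at-most-once semantics:
// both endpoints of a secure channel increment a counter per invocation, and
// the receiver accepts only the next expected number, so a replayed message
// is rejected.
public class ReplayGuard {
    private long expectedSeq = 0;

    public boolean accept(long seq) {
        if (seq != expectedSeq) return false;  // replay or out-of-order: reject
        expectedSeq++;
        return true;
    }
}
```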
In a more general scheme, the request and the reply may use different keys
and message identifiers. This generality has some advantages; for example, it
allows an invocation that returns after a long wait to use a fresh key. The changes
in the definition of message identifiers are considerable, so we do not discuss them
here.
a throughput of about 870 KBytes/sec; this implies that nearly 90% of the CPU
is dedicated to DES computation.
As these measurements demonstrate, the performance of our system is ac-
ceptable for many applications. However, we believe that there remains much
room for optimization.
To date, we have had two preliminary but encouraging experiences in the appli-
cation of secure network objects.
In the first application, we have implemented a secure version of an answering
service written by Rob DeLine. This service is part of a telecollaboration system
and implements an answering machine for multi-media messages. The answering
machine is a network object with methods create, retrieve, and delete. When
a message is created, it is stored under a special name, called a cookie; the cookie
is e-mailed to the intended recipient of the message. An e-mail reader can then
present the cookie to retrieve or to delete the message.
The original answering service is not secure. In particular, cookies are essen-
tially used as capabilities, yet they are communicated in clear and are easy to
guess. Therefore, the service has neither authenticity nor secrecy.
Our version of the answering service addresses these problems. Because of
the performance limitations of software encryption, we have not provided se-
crecy but only authenticity. In our version, the answering machine is a network
object of type AuthNetObj.T; it would be easy to make it a network object of
type SecNetObj.T instead. We store the names of both the sender and intended
recipient with each message; appropriate identity objects must be passed as
arguments of calls to the methods create, retrieve, and delete. For example,
only the sender and intended recipient of a message can delete the message.
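The identity checks in the secure answering service can be sketched as follows. This is an illustrative Java rendering with invented names (AnsweringMachine, byCookie); in the real service, identities are Ident.T objects passed over secure invocations, not strings:

```java
import java.util.*;

// Sketch of the secure answering machine: each message records its sender
// and intended recipient, retrieval requires the recipient's identity, and
// only the sender or the recipient may delete the message.
public class AnsweringMachine {
    private static final class Message {
        final String sender, recipient, body;
        Message(String s, String r, String b) { sender = s; recipient = r; body = b; }
    }

    private final Map<String, Message> byCookie = new HashMap<>();
    private long next = 0;

    // create: store the message under a cookie (e-mailed to the recipient).
    public String create(String senderId, String recipientId, String body) {
        String cookie = "cookie-" + (next++);
        byCookie.put(cookie, new Message(senderId, recipientId, body));
        return cookie;
    }

    // retrieve and delete need an appropriate identity as well as the cookie,
    // so a leaked or guessed cookie alone is not enough.
    public String retrieve(String id, String cookie) {
        Message m = byCookie.get(cookie);
        return (m != null && id.equals(m.recipient)) ? m.body : null;
    }

    public boolean delete(String id, String cookie) {
        Message m = byCookie.get(cookie);
        if (m == null || !(id.equals(m.sender) || id.equals(m.recipient))) return false;
        byCookie.remove(cookie);
        return true;
    }
}
```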
Our second application is a secure version of Obliq [3]. Roughly, Obliq is to
Modula-3 as Tcl is to C. Obliq is an untyped, interpreted scripting language
that supports distributed object-oriented computation. For example, it can be
used to program computing agents that roam over a network.
In our version of Obliq, each object is implemented as a network object of type
AuthNetObj.T. When an Obliq object o is exported, it is explicitly tagged with a
list of the principals that may import it. The Obliq run-time system encapsulates
o in a reference monitor of type AuthNetObj.T. This reference monitor provides
two methods: a method for access to o and a method that returns the identity
of the owner of the reference monitor. The former method is responsible for
performing access control checks. The latter method allows a client that imports
o to verify that the server is the expected one.
We have implemented the secure version of Obliq that we have just described.
This implementation provides evidence that we can build useful security mech-
anisms for Obliq fairly easily. On the other hand, we do not yet have sufficient
experience to decide which security mechanism is most appropriate.
6 Related Work
7 Conclusions
Secure network objects behave like insecure network objects, but preserve the
intended semantics even in the presence of an active attacker; optionally, secure
network objects provide secrecy from eavesdroppers. In addition, identity ob-
jects form the basis for authentication in our system. The combination of secure
network objects with identity objects leads to a simple programming model and
a simple implementation, both in the spirit of the original network objects. Thus
we have integrated security and network objects.
Overall, we felt that object-orientation was helpful. Not surprisingly, we
found that objects give rise to natural units of protection. We also took ad-
vantage of the object type system to specify security properties, directly and
economically.
In our description, we have tried to be precise, but not formal. We believe that
a more formal study would be interesting. In particular, it would be worthwhile
to give notations and rules for reasoning about secure network objects.
Acknowledgments
We would like to thank Andrew Birrell, Greg Nelson, and Luca Cardelli for many
helpful discussions; and Roger Needham, Cynthia Hibbard, Tim Mann, Rustan
Leino, Andrew Tanenbaum, Philip Homburg, Raoul Bhoedjang, Tim Ruhl, and
Greg Sharp for helpful comments on the paper.
References
1. Jean Bacon, Richard Hayton, Sai Lai Lo, and Ken Moody. Extensible access control
for a hierarchy of servers. ACM Operating Systems Review, 28(3):4-15, July 1994.
2. Andrew Birrell, Greg Nelson, Susan Owicki, and Edward Wobber. Network objects.
Software Practice and Experience, 25(S4):87-130, December 1995.
3. Luca Cardelli. A language with distributed scope. Computing Systems, 8(1):27-59,
January 1995.
4. W.R. Cheswick. An evening with Berferd, in which a hacker is lured, endured, and
studied. In Proceedings of the Usenix Winter '92 Conference, 1992.
5. R.H. Deng, S.K. Bhonsle, W. Wang, and A.A. Lazar. Integrating security in
CORBA based object architectures. In Proceedings of the 1995 IEEE Symposium
on Security and Privacy, pages 50-61, May 1995.
6. J.B. Dennis and E.C. van Horn. Programming semantics for multiprogrammed
computation. Communications of the ACM, 9(3):143-155, March 1966.
7. Li Gong. A secure identity-based capability system. In Proceedings of the 1989
IEEE Symposium on Security and Privacy, pages 56-63, May 1989.
8. Graham Hamilton. Personal communication, 1994 and 1996.
9. Paul Ashley Karger. Improving Security and Performance for Capability Systems.
PhD thesis, Cambridge University, October 1988.
10. Butler Lampson. A note on the confinement problem. Communications of the
ACM, 16(10):613-615, October 1973.
11. Butler Lampson. Protection. ACM Operating Systems Review, 8(1):18-24, January
1974.
12. Butler Lampson, Martin Abadi, Mike Burrows, and Edward Wobber. Authentica-
tion in distributed systems: Theory and practice. ACM Transactions on Computer
Systems, 10(4):265-310, November 1992.
13. J. Mitchell, J. Gibbons, G. Hamilton, P. Kessler, Y. Khalidi, P. Kougiouris,
P. Madany, M. Nelson, M. Powell, and S. Radia. An overview of the Spring system.
In IEEE Compcon Spring 1994, February 1994.
14. R. Molva, G. Tsudik, E. van Herreweghen, and S. Zatti. Kryptoknight authenti-
cation and key distribution system. In Proceedings of the European Symposium on
Research in Computer Security, November 1992.
15. Sape J. Mullender, Andrew S. Tanenbaum, and Robbert van Renesse. Using sparse
capabilities in a distributed operating system. In Proceedings of the 6th IEEE
conference on Distributed Computing Systems, June 1986.
16. National Bureau of Standards. Data encryption standard. FIPS 46, 1977.
17. Roger Needham. Names. In Sape Mullender, editor, Distributed Systems, chap-
ter 12, pages 315-327. Addison-Wesley, second edition, 1993.
18. Greg Nelson, editor. Systems Programming with Modula-3. Prentice Hall, 1991.
19. Object Management Group. Common object request broker architecture and spec-
ification. OMG Document number 91.12.1.
1 Introduction
The integration of mobile code with web browsing creates an access-control
dilemma. On one hand, it creates a social expectation that mobile code should
be as easy to download and execute as fetching and viewing a web page. On the
other hand, the popularity and ubiquity of mobile code increases the likelihood
that malicious programs will mingle with benign ones.
To reassure users about the safety of their data and to keep the user interface
simple and non-intrusive, systems supporting mobile code have chosen to err on
the side of conservatism and simplicity. Depending on its source, mobile code is
partitioned into trusted and untrusted code. Code is considered trusted if it is
loaded from disk [9,12] or if it is signed by an author/organization deemed trust-
worthy by the user [12,30]. Untrusted code is confined to a severely restricted
execution environment [9] (e.g., it cannot open local files or sockets, cannot cre-
ate a subprocess, cannot initiate print requests, etc.); trusted code is either given
access to all available resources [30] or is given selective access based on user-
specified access-control lists [12].
For the programs considered untrusted, these mechanisms can be overly re-
strictive. Many useful and safe programs, such as a well-behaved editor applet
from a lesser-known software company, cannot be used because they cannot open
local files. In addition, to implement new resource-sharing models such as global
computing [6] all communication has to be routed through brokers. This significantly
* This paper is a reprint of a paper that appeared in the Fifth ACM Conference on
Computer and Communications Security (November 3-5, 1998).
limits the set of problems that can be efficiently handled by such models. For
programs considered trusted, these models can be too lax. Errors, not just malice
aforethought, can wipe out or leak important data. Combined with a suitable
audit trail, signed programs [12] do provide the ability to take legal recourse if
need be.
In this chapter, we present a history-based access-control mechanism that is
suitable for mediating accesses from mobile code. The key idea behind history-
based access-control is to maintain a selective history of the access requests made
by individual programs and to use this history to improve the differentiation be-
tween safe and potentially dangerous requests. What a program is allowed to do
depends on its own identity and behavior in addition to currently used discrimi-
nators like the location it was loaded from or the identity of its author/provider.
History-based access-control has the potential to significantly expand the set of
programs that can be executed without compromising security or ease of use.
For example, consider an access-control policy that allows a program to open
local files for reading as long as it has not opened a socket and allows it to open
a socket as long as it has not opened a local file for reading. Irrespective of the
source of the program, such a policy can ensure that no disk-resident data will
be leaked. Strictly speaking, this is true only if it is possible to intercept all
access requests made on behalf of the program: the requests the program makes
itself as well as those made on its behalf by other code. The technique we present is able to
intercept all requests.
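The example policy above can be sketched as follows; this is an illustrative Java sketch (the class and method names are invented), not the Deeds handler code, which comes later in the chapter:

```java
// Sketch of the example history-based policy: a program may open local
// files for reading as long as it has not opened a socket, and may open a
// socket as long as it has not opened a local file for reading. Either way,
// no disk-resident data can be leaked over the network.
public class FileOrSocketPolicy {
    private boolean openedFile = false;
    private boolean openedSocket = false;

    public synchronized boolean requestFileRead() {
        if (openedSocket) return false;  // program already has a network path out
        openedFile = true;
        return true;
    }

    public synchronized boolean requestSocket() {
        if (openedFile) return false;    // program has already seen local data
        openedSocket = true;
        return true;
    }
}
```

The policy depends only on the program's own history, not on where it was loaded from or who signed it.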
We first present some examples of history-based access-control policies. Next,
we discuss issues that have to be resolved for implementing history-based access-
control mechanisms. In section 3, we describe Deeds,¹ an implementation of
history-based access-control for Java programs. Access-control policies for Deeds
are written in Java, and can be installed, removed or modified while the programs
whose accesses are being mediated are still executing. Deeds requires policies to
adhere to several constraints. These constraints are checked either at compile-
time by the Java compiler or at runtime by the Deeds policy manager. We
illustrate the operation of the Deeds user interface using snapshots. In section 4.4,
we examine the additional overhead imposed by Deeds using micro-benchmarks
as well as real programs. History-based access-control is not specific to Java or
to mobile code. It can be used for any system that allows interposition of code
between untrusted programs and protected resources. In section 5, we discuss
how a system similar to Deeds can be used to mediate accesses to OS resources
from native binaries. We conclude with a description of related work and the
directions in which we plan to extend this effort.
2 Examples
One-out-of-k: Consider the situation when you want to allow only those pro-
grams that fall into well-marked equivalence classes based on their functionality
¹ Your deeds determine your destiny :)
and behavior. For example, you want to allow only programs that provide just
the functionality of a browser or an editor or a shell. A browser can connect to
remote sites, create temporary local files in a user-specified directory, read files
that it has created and display them to the user. An editor can create local files
in user-specified directories, read/modify files that it has created, and interact
with the user. It is not allowed to open sockets. A shell can interact with the
user and can create sub-processes. It cannot open local files, or connect to remote
sites. This restriction can be enforced by a history-based access-control policy
that:
Keeping out rogues: Consider the situation where you want to ensure that
a program that you once killed due to inappropriate behavior is not allowed to
execute on your machine. This restriction can be enforced, to some extent, by
a history-based access-control policy that keeps track of previous termination
events and the identity of the programs that were terminated.
Frustrating peepers: Consider the situation where you want to allow a pro-
gram to access only one of two relations in a database but not both. One might
wish to do this if accessing both the relations may allow a program to extract
information that it cannot get from a single relation. For example, one might
wish to allow programs to access either a relation that contains the date and the
name of medical procedures performed in a hospital or a relation that contains
the names of patients and the date they last came in. Individually, these rela-
tions do not allow a program to deduce information about treatment histories of
individual patients. If, however, a program could access both relations, it could
combine the relations to acquire (partial) information about treatment histories
for individual patients. This example can be seen as an instance of the Chinese
Wall Policy [4]. To prevent a hostile site from deducing the same information by
combining data collected by two different programs it provides, programs that
have opened a socket are thereafter not allowed to access sensitive relations,
and programs that have accessed one of the sensitive relations are thereafter
not allowed to open sockets.
Slowing down hogs: Consider the situation where you want to limit the rate
at which a program connects to its home site. One might wish to do this, for
example, to eliminate a form of denial of service where a program repeatedly
connects to its home site without doing anything else. This can be enforced by
a history-based access-control policy that keeps track of the timestamp of the
last request. It allows a request to proceed only if a threshold period has
elapsed since the last one.
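The last-request bookkeeping described above can be sketched as follows. This is an illustrative Java sketch, not Deeds code; the class and method names (ConnectRatePolicy, checkConnect) are hypothetical, and the event-history is simply a per-program timestamp:

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch of a "slowing down hogs" policy: the event-history
// is the timestamp of the last connection request made by each program.
public class ConnectRatePolicy {
    private final long thresholdMillis;           // minimum gap between requests
    private final Map<String, Long> lastRequest = new HashMap<>();

    public ConnectRatePolicy(long thresholdMillis) {
        this.thresholdMillis = thresholdMillis;
    }

    // Handler for the connect event: allow the request only if the threshold
    // period has elapsed since this program's last allowed request.
    public synchronized boolean checkConnect(String programId, long now) {
        Long last = lastRequest.get(programId);
        if (last != null && now - last < thresholdMillis) {
            return false;                         // too soon: reject
        }
        lastRequest.put(programId, now);          // record in the event-history
        return true;
    }
}
```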
4.1 Architecture
— a handler for socket-creation that records if a socket was ever created by this
program. It rejects the request if a file has been opened by this program (for
reading or writing).
— a handler for file-creation that associates a creator with each file created by a
downloaded program. If the file is to be created in a directory that is included
in a list of user-specified directories, it allows the request to proceed;
otherwise, it rejects the request.
— a handler for open-file-for-read that records if a file was ever opened for
reading by this program. It rejects the request if a socket has been created
by this program.
— a handler for open-file-for-modification that records if a file was ever opened
for writing by this program. It rejects the request if a socket has been created
by this program or if the file in question was not created by this program.
Deeds allows multiple access-control policies to be simultaneously active. Poli-
cies can be installed, removed, or modified during execution. A policy is added
by attaching its constituent handlers to the corresponding events. For example,
the editor policy would be added by attaching its handlers respectively to the
socket-creation event, the file-creation event, the open-file-for-read event and the
open-file-for-modification event. Policies can be removed in an analogous manner
by detaching the constituent handlers from the associated events.
Deeds allows policies to be parameterized. For example, a policy that controls
file creation can be parameterized by the directory within which file creation is
allowed. Policies that are already installed can be modified by changing their
parameters. This allows users to make on-the-fly changes to the environment
within which mobile code executes.
Deeds provides a fail-safe default [32] for every security event. Unless overrid-
den, the default handler for an event disallows all requests associated with that
event from downloaded programs. The default handler can only be overridden
by explicit user request - either by a dialog box or by a profile file containing a
list of user preferences.
4.2 Implementation
In this subsection, we describe the implementation of Deeds. We focus on imple-
mentation of program identity, events, event-histories, policies (including con-
ventions for writing them), and policy management.
Fig. 1. Example of an event manager class. Managers for other events would share the
same structure but would replace checkRead by the name of the particular event. Some
administrative details have been left out of the example; these details are common to
all event managers.
as a Java class that extends the AccessPolicy class shown in Figure 2. Handlers
are implemented as methods of this class and the event-history is implemented
as variables of this class. For example, a handler for the open-file-for-reading
event could check if a socket has yet been created by the program. If so, it could
raise a GeneralSecurityException; else, it could set a boolean to indicate that
a file has been opened for reading and return.
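A minimal sketch of such a policy class follows. The class name and handler names are hypothetical, and the AccessPolicy superclass from Figure 2 is omitted to keep the sketch self-contained; only GeneralSecurityException and the boolean event-history come from the text:

```java
import java.security.GeneralSecurityException;

// Illustrative sketch of a policy class in the style described above:
// handlers are methods, and the event-history is kept in instance variables.
public class EditorPolicy /* would extend AccessPolicy */ {
    private boolean socketCreated = false;   // event-history
    private boolean fileAccessed  = false;   // event-history

    // Handler for the open-file-for-read event: reject if this program has
    // already created a socket; otherwise record the access and return.
    public synchronized void checkRead(String file) throws GeneralSecurityException {
        if (socketCreated) {
            throw new GeneralSecurityException("file read after socket creation");
        }
        fileAccessed = true;
    }

    // Handler for the socket-creation event: reject if this program has
    // already opened a file.
    public synchronized void checkSocketCreate() throws GeneralSecurityException {
        if (fileAccessed) {
            throw new GeneralSecurityException("socket creation after file access");
        }
        socketCreated = true;
    }
}
```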
When a security event occurs (e.g., when checkRead is called), control is
transferred to the Deeds Security Manager which determines the class-loader for
the program that caused the event using the currentClassLoader() method
provided by the Java Security Manager. This method returns the class-loader
corresponding to the most recent occurrence on the stack of a method from
a class loaded using a class-loader. Since a new instance of the class-loader is
created for every downloaded program and since this instance loads all non-
system-library classes for the program, currentClassLoader() always returns
the same class-loader every time it is called during the execution of a program.
This technique safely determines the identity of the program that caused the
security event.
Fig. 2. Skeleton of the AccessPolicy class. The synchronized keyword ensures that at
most one handler is updating the event-history at any given time.
or "*.checkRead". The former expression specifies only the checkRead
event defined in the FileIO package whereas the latter specifies all check-
Read events irrespective of the package they have been defined in. This spec-
ification is needed as Java's hierarchical namespace allows multiple methods
with the same name to exist in different regions of the namespace. Since a
security event is implemented by a method in a subclass of EventManager
and since every package can have its own security events, the possibility of
name clashes is real. For example, a library to perform file I/O, and a li-
brary to interact with a database could both wish to create a checkRead
event. Since packages are independently developed, extensible systems, such
as Deeds, cannot assume uniqueness of event names.
— Each policy must be accompanied by its source code and the name of the file
containing the source code should be available as a member of the class im-
plementing the policy. We believe that availability of source code of a policy
is important to instill confidence in its operation and its documentation.
Policy manager The Deeds policy manager makes extensive use of the Java re-
flection mechanism [17]. This mechanism allows Java code to inspect and browse
the structure of other classes. The policy manager uses reflection to: (1) identify
methods that are to be used as handlers (they are declared public void
and throw the GeneralSecurityException); (2) identify parameters and their
types; (3) initialize and update parameters; and (4) extract specification of the
events that the handlers are to be attached to. In addition, it performs sev-
eral administrative checks such as ensuring that all policy instances have unique
names.
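The first of these reflective steps can be sketched as follows. HandlerScanner and SamplePolicy are hypothetical names; only the selection criteria — declared public void and throwing GeneralSecurityException — come from the text:

```java
import java.lang.reflect.Method;
import java.lang.reflect.Modifier;
import java.security.GeneralSecurityException;
import java.util.ArrayList;
import java.util.List;

public class HandlerScanner {
    // A method qualifies as a handler if it is declared public void and
    // lists GeneralSecurityException among its thrown exceptions.
    public static List<String> findHandlers(Class<?> policy) {
        List<String> handlers = new ArrayList<>();
        for (Method m : policy.getDeclaredMethods()) {
            if (!Modifier.isPublic(m.getModifiers())) continue;
            if (m.getReturnType() != void.class) continue;
            for (Class<?> ex : m.getExceptionTypes()) {
                if (ex == GeneralSecurityException.class) {
                    handlers.add(m.getName());
                    break;
                }
            }
        }
        return handlers;
    }

    // Illustrative policy: checkRead qualifies as a handler; reset does not.
    public static class SamplePolicy {
        public void checkRead(String file) throws GeneralSecurityException {}
        public void reset() {}
    }
}
```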
The policy manager is also responsible for ensuring that policies are per-
sistent. It achieves this by storing the parameters for each policy instance on
stable storage and using them to re-install the policy when the environment is
reinitialized (on startup). It also periodically saves the event-history on stable
storage.^
The Deeds user interface comes up only on user request and is used for infre-
quent operations such as browsing/loading/installing policies. In this section, we
describe the functionality of the user interface and present snapshots.
Browsing/viewing/loading policies: The Deeds user interface allows users
to browse the set of available policies, to view documentation and source code
for these policies and to create instances of individual policies. Note that every
policy is required to have documentation (via the documentation() method)
and access to its own source code (via the srcFileName member). In addition,
every parameter has associated documentation which can be viewed. To load a
^ Note that individual policies are free to save the event-history as frequently as they
wish.
[Snapshot of the Deeds user interface: controls for loading/unloading policies,
installing/uninstalling policy instances, viewing/modifying loaded and installed
policies, and saving settings.]
Number of handlers:  0    1    2    3    4    5    6    7    8    9    10
Percent overhead:   0.7  1.8  2.6  2.4  2.9  3.5  4.2  3.9  4.3  4.1  4.3
Table 1. Overhead of Deeds security event handlers. The overhead was measured using
a microbenchmark which repeatedly opened and closed files. Each handler was identical
(but distinct) and implemented the editor policy.
to the time it takes to load just the first file using existing class-loaders. In both
cases, all the files were local and were in the operating-system file-cache. For this
experiment, we selected seven complete Java applications available on the web.
The applications we used were: (1) news-server, the Spaniel News Server [38]
which manages and serves newsgroups local to an organization; (2) jlex, the
JLex [23] lexical analyzer; (3) dbase, the Jeevan [22] platform-independent,
object-oriented database; (4) jawavedit, the JaWavedit audio file editor [21]
with multi-lingual voice synthesis, signal processing, and a graphical user inter-
face; (5) obfuscator, the Hashjava [14] obfuscator for Java class files; (6) javacc,
the JavaCC [20] parser generator; and (7) editor, the WingDis editor [42].
Table 2 presents results for the latency experiments. As expected, the addi-
tional startup latency increases with the number of files as well as the total size
of the program. Note this does not represent an increase in end-to-end execution
time. Existing class-loaders already parse the bytecodes of class files as a part
of the Java verification process; signed applets require computation of a simi-
lar hash function. Instead, the increase in startup latency is caused by moving
the processing for all the class files before the execution begins. We expect that,
once downloaded, programs of this size and these types (lexer/parser generators,
editors, news server, database etc) will be reused several times. In that case, the
program can be cached as a whole (instead of individual files) and the additional
startup latency has to be incurred only once.
5 Discussion
6 Related work
Deeds is currently operational and can be used for stand-alone Java programs.
We are in the process of identifying a variety of useful patterns of behaviors
and evaluating the performance and usability of Deeds in the context of these
behaviors.
In the near term, we plan to develop a history-based mechanism for me-
diating access to OS resources from native binaries. We also plan to explore
the possibility of using program labels to indicate pre-classified behaviors and
automatic loading/unloading of access-control policies to support this.
In the longer term, we plan to explore just-in-time binary rewriting to insert
event generation and dispatching code into downloaded programs. This would
allow users to create new kinds of events as and when they desire. Currently,
new kinds of events are created only by system libraries.
Acknowledgments
We would like to thank the anonymous referees for their insightful comments, which
helped improve the presentation of this paper.
References
1. A. Alexandrov, M. Ibel, K. Schauser, and C. Scheiman. Extending the operating
system at the user level: the Ufo global file system. In Proceedings of the 1997
USENIX Annual Technical Conference, 1997.
2. B. Bershad, S. Savage, P. Pardyak, et al. Extensibility, safety and performance
in the spin operating system. In Proc of the 15th ACM Symposium on Operating
System Principles, pages 267-84, 1995.
24. M. Jones. Interposition agents: Transparently interposing user code at the sys-
tem interface. In Proceedings of the 14th ACM Symposium on Operating System
Principles, 1993.
25. P. Karger. Limiting the damage potential of the discretionary trojan horse. In
Proceedings of the 1987 IEEE Symposium on Research in Security and Privacy, 1987.
26. M. King. Identifying and controlling undesirable program behaviors. In Proceedings
of the 14th National Computer Security Conference, 1992.
27. C. Ko, G. Fink, and K. Levitt. Automated detection of vulnerabilities in privileged
programs by execution monitoring. In Proceedings. 10th Annual Computer Security
Applications Conference, pages 134-44, 1994.
28. N. Lai and T. Gray. Strengthening discretionary access controls to inhibit tro-
jan horses and computer viruses. In Proceedings of the 1988 USENIX Summer
Symposium, 1988.
29. N. Mehta and K. Sollins. Extending and expanding the security features of Java.
In Proceedings of the 1998 USENIX Security Symposium, 1998.
30. Microsoft Corporation. Proposal for Authenticating Code Via the Internet, Apr
1996. http://www.microsoft.com/intdev/security/authcode.
31. R. Rivest. The MD5 message-digest algorithm. RFC 1321, Network Working
Group, 1992.
32. J. Saltzer and M. Schroeder. The protection of information in computer systems.
Proceedings of the IEEE, 63(9):1278-1308, Sep 1975.
33. R. Scheifler and J. Gettys. X Window System : The Complete Reference to Xlib,
X Protocol, Icccm, Xlfd. Butterworth-Heinemann, 1992.
34. F. Schneider. Enforceable security policies. Technical report, Dept of Computer
Science, Cornell University, 1998.
35. C. Serban and B. McMillin. Run-time security evaluation (RTSE) for distributed
applications. In Proc. of the 1996 IEEE Symposium on Security and Privacy, pages
222-32, 1996.
36. Secure hash standard. Federal Information Processing Standards Publication,
FIPS, PUB 180-1, April 1995.
37. R. Simon and M. Zurko. Separation of duty in role-based environments. In Pro-
ceedings of the IEEE Computer Security Foundations Workshop '97, 1997.
38. The Spaniel News Server. Available from Spaniel Software^9.
39. V. Varadharajan and P. Allen. Joint actions based authorization schemes. Oper-
ating Systems Review, 30(3):32-45, 1996.
40. D. Wallach, D. Balfanz, D. Dean, and E. Felten. Extensible security architecture
for Java. In SOSP 16, 1997.
41. D. Wichers, D. Cook, R. Olsson, J. Crossley, P. Kerchen, K. Levitt, and R. Lo.
PACL's: an access control list approach to anti-viral security. In USENIX Work-
shop Proceedings. UNIX SECURITY II, pages 71-82, 1990.
42. The WingDis Editor. Available from WingSoft Corporation, P.O. Box 7554, Fremont,
CA 94537^10.
^9 http://www.searchspaniel.com/newsserver.html
^10 http://www.wingsoft.com/javaeditor.shtml
Security in Active Networks
Abstract. The desire for flexible networking services has given rise to
the concept of "active networks." Active networks provide a general
framework for designing and implementing network-embedded services,
typically by means of a programmable network infrastructure. A pro-
grammable network infrastructure creates significant new challenges for
securing the network infrastructure.
This paper begins with an overview of active networking. It then moves
to security issues, beginning with a threat model for active networking,
moving through an enumeration of the challenges for system designers,
and ending with a survey of approaches for meeting those challenges.
The Secure Active Networking Environment (SANE) realizes many of
these approaches; an implementation exists and provides acceptable per-
formance for even the most aggressive active networking proposals such
as active packets (sometimes called "capsules").
We close the paper with a discussion of open problems and an attempt
to prioritize them.
1 What is Active Networking?
As the role of active networking elements is to store, compute and forward, the
managed resources are those required to store packets, operate on them, and
forward them to other elements. The resources provided to various principals
at any instant cannot exceed the real resources (e.g., output port bandwidth)
available at that instant. This emphasis on real resources and time implies that
a conventional < object, principal, access> 3-tuple for an access control list (ACL)
is inadequate.
To provide controlled access to real resources, with real time constraints, a
fourth element to represent duration (either absolute or periodic) must be added,
giving <object, principal, access, QoS guarantees>. This remains an ACL, but is
not "virtualized" by leaving time unspecified and making "eventual" access ac-
ceptable. We should point out that this new element in the ACL can be encoded
as part of the access field. Similarly, we need not use an actual ACL, but we may
use mechanisms that can be expressed in terms of ACLs and are better-suited
for distributed systems.
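The four-element entry can be sketched as a simple record with a time-bounded check. Field and method names here are illustrative, and the QoS element is reduced to a start time and a duration; a real entry could equally encode it inside the access field, as noted above:

```java
// Illustrative sketch of the <object, principal, access, QoS guarantees> entry.
public class AclEntry {
    final String object;         // protected resource (e.g., an output port)
    final String principal;      // who is requesting access
    final String access;         // kind of access requested
    final long   startMillis;    // QoS guarantee: when the grant begins
    final long   durationMillis; // QoS guarantee: how long it lasts

    public AclEntry(String object, String principal, String access,
                    long startMillis, long durationMillis) {
        this.object = object;
        this.principal = principal;
        this.access = access;
        this.startMillis = startMillis;
        this.durationMillis = durationMillis;
    }

    // The entry grants a request only if it matches and the request falls
    // within the guaranteed window of real time -- access is not "eventual".
    public boolean grants(String obj, String prin, String acc, long now) {
        return object.equals(obj) && principal.equals(prin) && access.equals(acc)
                && now >= startMillis && now < startMillis + durationMillis;
    }
}
```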
2 Terminology
The term trust is used heavily in computer security. Unfortunately, the term
has several definitions depending on who uses it and how the term is used. In
fact, the U.S. Department of Defense's Orange Book [20], which defined several
levels of security a computer host could provide, defines trust ambiguously. The
definition of trust used herein is a slight modification of that by Neumann [46].
An object is defined as trusted when the object operates as expected according to
design and policy. A stronger trust statement is when an object is trustworthy. A
trustworthy object is one that has been shown in some convincing manner, e.g.,
a formal code-review or formal mathematical analysis, to operate as expected. A
security-critical object is one which the security — defined by a policy — of the
system depends on the proper operation of the object. A security-critical object
can be considered trusted, which is usually the case in most secure systems, but
unfortunately this leads to an unnecessary profusion of such objects.
We note the distinction between trust and integrity: Trust is determined
through the verification of components and the dependencies among them. In-
tegrity demonstrates that components have not been modified. Thus integrity
checking in a trustworthy system is about preserving an established trust or
trust relationship.
to a 10% degradation, or fail to operate (since it could not invoke a new process)
when it hit the table space limitation. Fortunately, a number of new operating
systems [40, 35] have appeared which provide the services necessary to contain
one or more executing threads within a single scheduling domain.
#!/bin/sh
$0 # invoke ourselves: each invocation spawns another, exhausting the process table
It is our belief that, as in this list, security is often left until last in the
design process, with the result that it receives too little attention and emphasis.
If security is designed in, it can simply be made part of the design
space in which we search for attractive cost/performance tradeoffs. For example,
if acceptable flexibility requires downloadable software, and acceptable security
means that only trusted downloadable software will be loaded, our cost and
performance optimizations will reflect ideas such as minimizing dynamic checks
with static pre-checks or other means. If security is not an issue, there is no
point in doing this.
The designer's major challenge is finding a point (or set of points) in the
design space which is acceptable to a large enough market segment to influence
the community of users. Sometimes this is not possible; the commercial empha-
sis on forwarding performance is so overwhelming that concessions to security
slowing the transport plane are simply unacceptable. Fortunately, organizations
have become sufficiently dependent on information networks that security does
sell.
In the context of active networks, the major focus of security is the set of
activities which provide flexibility; that is, the facility to inject new code "on-
the-fly" into network elements. To build a secure infrastructure, first, the in-
frastructure itself (the "checker") must be unaltered. Second, the infrastructure
must provide assurance that loaded modules (the dynamic checking) will not
violate the security properties. In general, this is very hard. Some means cur-
rently under investigation include domain-specific languages which are easy to
check (e.g., PLAN), proof-carrying code [45, 44], restricted interfaces (ALIEN),
and distributed responsibility (SANE). Currently, the most attractive point in
the design space appears to be a restricted domain-specific language coupled to
an extension system with heavyweight checks. In this way, the frequent (per-
packet) dynamic checks are inexpensive, while focusing expensive scrutiny on
the extension process. This idea is manifest in the SwitchWare active network
architecture [2].
into the system. These services may require authentication and authorization
before allowing access to the resources they protect.
The Safetynet Project [63] at the University of Sussex has also designed a new
language for active networking. They have explicitly enumerated what they feel
are the important requirements for an active networking language and then set
about designing a language to meet those requirements. In particular, they differ
from PLAN in that they hope to use the type system to allow safe accumulation
of state. They appear to be trying to avoid having any service layer at all.
Java [23] and ML [39, 34] (and the MMM [37] project) provide security
through language mechanisms. More recent versions of Java provide protection
domains [22]. Protection domains were first introduced in Multics [55, 56, 38, 54].
These solutions are not applicable to programs written in other languages (as
may be the case with a heterogeneous active network with multiple execution
environments), and are better suited for the applet model of execution than
active networks. The need for a separate bytecode verifier is also considered
by some a disadvantage, as it forces expensive (in the case of Java, at least)
language-compliance checks prior to execution. In this area, there is some re-
search in enhancing the understanding of the tradeoffs between compilation
time/complexity, and bytecode size, verification time, and complexity.
It should be noted that language mechanisms can (and sometimes do) serve as
the basis of security of an active network node. Other language-based protection
schemes can be found in [9, 13, 26, 36, 33, 24].
3 SANE Architecture
Previous attempts at system security have not taken a holistic approach. The
approaches typically focused on a major component of the system. For instance,
operating system research has usually ignored the bootstrap process of the host.
As a result, a trustworthy operating system is started by an untrustworthy boot-
strap! This creates serious security problems since most Operating Systems re-
quire some lower level services, e.g., firmware, for trustworthy initiahzation and
operation. A major design goal of SANE [3] was to reduce the number and size
of components that are assumed as trustworthy. A second major design goal of
SANE was to provide a secure and reliable mechanism for establishing a secu-
rity context for active networking. An application or node could then use that
context in any manner it desired.
No practical system can avoid assumptions, however, and SANE is no dif-
ferent. Two assumptions are made by SANE. The first assumption is that the
physical security of the host is maintained through strict enforcement of a phys-
ical security policy. The second assumption SANE makes is the existence of a
Public Key Infrastructure (PKI). While a PKI is required, no assumptions are
made as to the type of PKI, e.g., hierarchical or "web of trust."[15, 31, 66, 10,11]
The overall architecture of SANE for a three-node network is shown in Fig-
ure 2.
The initialization of each node begins with the bootstrap. Following the
successful completion of the bootstrap, the operating system is started which
loads a general purpose evaluator, e.g., a Caml [34] or Java [23] runtime. The
evaluator then starts an "Active Loader" which restricts the environment pro-
vided by the evaluator. Finally, the loader loads an "Active Network Evaluator"
(ANE) which accepts and evaluates active packets, e.g., PLAN [27], Switchlet,
or ANTS [64]. The ANE then loads the SANE module to establish a security
context with each network neighbor. Following the establishment of the security
context, the node is ready for secure operation within the active network.
It should be noted that the services offered by SANE can be used by most
active networking schemes. In our current system, SANE is used in conjunction
with the ALIEN architecture [1]. ALIEN is built on top of the Caml runtime,
and provides a network bytecode loader, a set of libraries, and other facilities
necessary for active networking.
The following sections describe the three components of SANE. These include
the AEGIS [5, 6] bootstrap system, the ALIEN [1] architecture, and SANE [2, 3]
itself.
AEGIS [5] modifies the standard IBM PC boot process so that all executable code,
except for a very small section of trustworthy code, is verified prior to execution
by using a digital signature. This is accomplished through modifications and ad-
ditions to the BIOS (Basic Input/Output System). In essence, the trustworthy
software serves as the root of an authentication chain that extends to the eval-
uator and potentially beyond, to "active" packets. In the AEGIS boot process,
either the Active Network element is started, or a recovery process is entered to
repair any integrity failure detected. Once the repair is completed, the system
is restarted to ensure that the system boots. This entire process occurs with-
out user intervention. AEGIS can also be used to maintain the hardware and
software configuration of a machine.
It should be noted that AEGIS does not verify the correctness of a software
component. Such a component could contain an exploitable flaw. The goal of
AEGIS is to prevent tampering of components that are considered trustworthy
by the system administrator. AEGIS verifies the integrity of already trusted
components. The nature of this trust is outside the scope of this paper.
Other work on the subject of secure bootstrapping includes [59, 65, 14, 32, 25].
A more extensive review of AEGIS and its differences with the above systems
can be found in [5, 6].
AEGIS Layered Boot and Recovery Process. AEGIS divides the boot
process into several levels to simplify and organize the BIOS modifications, as
shown in Figure 3. Each increasing level adds functionality to the system, pro-
viding correspondingly higher levels of abstraction. The lowest level is Level 0.
Fig. 2. SANE Network Architecture

Level 0 contains the small section of trustworthy software, digital signatures,
public key certificates, and recovery code. The integrity of this level is assumed
as valid. We do, however, perform an initial checksum test to identify PROM
failures. The first level contains the remainder of the usual BIOS code and the
CMOS. The second level contains all of the expansion cards and their associated
ROMs, if any. The third level contains the operating system boot sector. These
are resident on the bootable device and are responsible for loading the operating
system kernel. The fourth level contains the operating system, and the fifth and
final level contains the ALIEN architecture and other active nodes.
The transition between levels in a traditional boot process is accomplished
with a jump or a call instruction without any attempt at verifying the integrity
of the next level. AEGIS, on the other hand, uses public key cryptography and
cryptographic hashes to protect the transition from each lower level to the next
higher one, and its recovery process through a trusted repository ensures the
integrity of the next level in the event of failures [6].
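The guarded transition can be sketched as follows. This is an illustrative Java analogue, not AEGIS code: AEGIS itself operates in BIOS firmware and uses public-key signatures over component hashes, which we reduce here to a stored SHA-1 digest, and all names are hypothetical:

```java
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Arrays;

public class BootChain {
    // Verify the next level's image against a stored digest before handing
    // over control (AEGIS verifies a public-key signature over such a hash).
    public static boolean verifyNextLevel(byte[] image, byte[] expectedDigest) {
        try {
            byte[] actual = MessageDigest.getInstance("SHA-1").digest(image);
            return Arrays.equals(actual, expectedDigest);
        } catch (NoSuchAlgorithmException e) {
            throw new RuntimeException(e); // SHA-1 is always available in Java
        }
    }

    // A traditional boot jumps to the next level unconditionally; AEGIS
    // either continues the boot or enters the recovery process on failure.
    public static String transition(byte[] image, byte[] expectedDigest) {
        return verifyNextLevel(image, expectedDigest) ? "boot next level"
                                                      : "enter recovery";
    }
}
```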
The trusted repository can either be an expansion ROM board that contains
verified copies of the required software, or it can be another Active node. If the
repository is a ROM board, then simple memory copies can repair or shadow
failures. In the case of a network host, the detection of an integrity failure causes
the system to boot into a recovery kernel contained on the network card ROM.
The recovery kernel contacts a "trusted" host through the secure protocol de-
scribed in [6, 7] to recover a signed copy of the failed component. The failed
component is then shadowed or repaired, and the system is restarted (warm
boot).
which, in turn, provide access to shared resources. Further, under this runtime,
memory is a shared resource. The role of ALIEN is to control the access to these
shared resources and thereby ensure that a loaded program (called "switchlet")
does not exceed its resource limits (ALIEN is not responsible for determining
those limits).
ALIEN itself is built of three major components. The Loader provides the
interface to the Objective Caml runtime system. The Core Switchlet builds on
the Loader both by providing the security-related restrictions required and by
providing more generally useful interfaces to low-level functions. Finally, the
libraries are sets of utility routines. Each of these pieces will be briefly covered
in turn in the following paragraphs.
The Loader. The Loader provides the core of ALIEN's functionality. It provides
the interface to the operating system (through the language runtime) plus some
essential functions to allow system startup and loading of switchlets, as shown in
Table 1. Thus, it defines the "view of the world" for the rest of ALIEN. Moreover,
since security involves interaction with either the external system or with other
switchlets, the Loader provides the basis of security.
It should be noted that the Loader provides mechanisms rather than policy;
policies in the Core Switchlet can be changed by changing pieces of the Core
Switchlet.
The Core Switchlet. Above the Loader is the Core Switchlet. It is responsible
for providing the interface that switchlets see. It relies upon the Loader for
access to operating system resources, and then layers additional mechanisms
to add security and, often, utility. In providing an interface to switchlets, it
determines the security policies of the system. By including or excluding any
function, it can determine what switchlets can or cannot do. Since it is loadable,
the administrator can change or upgrade its pieces as necessary. This also allows
for changes in the security policy.
The policies of the Core Switchlet are enforced through a combination of
module thinning and type safety. Type safety ensures that a switchlet can only
access data or call functions that it can name. This allows implementations of
ALIEN that run in a single address space, thus avoiding the overheads normally
associated with crossing hardware-enforced boundaries [47].
Module thinning allows the Core Switchlet to present a limited interface to
switchlets. Combining this with type safety, switchlets can be prevented from
calling functions or accessing data even though they share an address space.
It is even possible to differentiate switchlets so as to provide a rich interface
to a trusted switchlet or to provide a very limited interface to an anonymous
switchlet. Similar approaches have been taken in [33, 9, 61].
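A Java analogue of module thinning combined with type safety can be sketched as follows. ALIEN is actually built on the Objective Caml runtime, and all names here are illustrative; the point is that which operations a switchlet can invoke is determined entirely by the interface type it is handed, even within one address space:

```java
// The narrow view handed to an anonymous switchlet.
interface ThinnedIO {
    int read();
}

// The richer view a trusted switchlet might receive.
interface FullIO extends ThinnedIO {
    void write(int b);
}

// One shared implementation behind both views.
class DeviceIO implements FullIO {
    private int last = 42;
    public int  read()       { return last; }
    public void write(int b) { last = b; }
}

class CoreSwitchlet {
    // Module thinning: expose only the thinned interface. Type safety then
    // guarantees the caller cannot name write() through this reference,
    // even though the object itself supports it and shares the address space.
    static ThinnedIO viewForAnonymous(DeviceIO dev) {
        return dev;
    }
}
```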
In many ways, the interface that the Core Switchlet presents to switchlets and
libraries is like the system call interface that a kernel presents to applications.
Through design of the interface the system can control access to underlying
resources. With a well-designed interface, the caller can combine the functions
provided to get useful work done. Table 2 shows the functionality provided by
the Core Switchlet.
Table 2. Functionality provided by the Core Switchlet:
language primitives: policy for access to the basic functions of the language
operating system access: policy for access to the operating system calls
network access: policy and mechanism for access to the network
thread access: policy for access to thread primitives
loading support: policy and mechanism to support the loading of switchlets
message logging: policy and mechanism for adding messages to the log file
The Library. The library is a set of functions which provide useful routines
that do not require privilege to run. The proper set of functions for the library is
a continuing area of research. Some of the things that are in the library for the
experiments we have performed include utility functions and implementations of
IP and UDP [52].
3.3 SANE Services
SANE builds on AEGIS and ALIEN in order to provide security services for an
active network. We believe that these services are required for the deployment of
a robust active infrastructure. This is not to say that they contain all the security
mechanisms one would ever want. Rather, they are basic building blocks needed
for possibly more advanced mechanisms. These services include:
^ To the extent that the cryptographic hash functions employed are resistant to collisions.
5 Acknowledgements
Portions of this paper are updated from [3] and [4]. This work was supported by
DARPA under Contract #N66001-96-C-852, with additional support from the
Intel Corporation.
References
[1] D. S. Alexander. ALIEN: A Generalized Computing Model of Active Networks.
PhD thesis, University of Pennsylvania, September 1998.
[2] D. S. Alexander, W. A. Arbaugh, M. Hicks, P. Kakkar, A. D. Keromytis, J. T.
Moore, C. A. Gunter, S. M. Nettles, and J. M. Smith. The SwitchWare Active
Network Architecture. IEEE Network Magazine, special issue on Active and Pro-
grammable Networks, 12(3):29-36, 1998.
[3] D. S. Alexander, W. A. Arbaugh, A. D. Keromytis, and J. M. Smith. A Secure
Active Network Environment Architecture: Realization in SwitchWare. IEEE
Network Magazine, special issue on Active and Programmable Networks, 12(3):37-
45, 1998.
[4] D. S. Alexander, W. A. Arbaugh, A. D. Keromytis, and J. M. Smith. Safety and
Security of Programmable Network Infrastructures. IEEE Communications
Magazine, 36(10):84-92, 1998.
[5] W. A. Arbaugh, D. J. Farber, and J. M. Smith. A Secure and Reliable Bootstrap
Architecture. In Proceedings 1997 IEEE Symposium on Security and Privacy,
pages 65-71, May 1997.
[6] W. A. Arbaugh, A. D. Keromytis, D. J. Farber, and J. M. Smith. Automated
Recovery in a Secure Bootstrap Process. In Proceedings of Network and Distributed
System Security Symposium, pages 155-167. Internet Society, March 1998.
[7] W. A. Arbaugh, A. D. Keromytis, D. J. Farber, and J. M. Smith. DHCP++: Applying
an efficient implementation method for fail-stop cryptographic protocols. In Proceedings
of Global Internet (GlobeCom) '98, November 1998.
[8] R. Atkinson. Security Architecture for the Internet Protocol. RFC 1825, August
1995.
[9] B. Bershad, S. Savage, P. Pardyak, E. G. Sirer, M. Fiuczynski, D. Becker, S. Eggers,
and C. Chambers. Extensibility, safety and performance in the SPIN operating
system. In Proc. 15th SOSP, pages 267-284, December 1995.
[10] M. Blaze, J. Feigenbaum, J. Ioannidis, and A. Keromytis.
The KeyNote Trust-Management System. Work in progress,
http://www.cis.upenn.edu/~angelos/keynote.html, June 1998.
[11] M. Blaze, J. Feigenbaum, J. Ioannidis, and A. Keromytis. The role of trust man-
agement in distributed systems security. In Secure Internet Programming [60].
[12] R. Braden, L. Zhang, S. Berson, S. Herzog, and S. Jamin. Resource ReSerVation
Protocol (RSVP) - Version 1 Functional Specification. Internet RFC 2208, 1997.
1 Introduction
A mobile agent is an executing program which can migrate from machine to
machine in a heterogeneous network under its own control. Its utility resides in
the fact that, by enabling the movement of both code and data, it allows the pro-
grammer to reduce network load or to create application-specific communication
protocols. Implementing reliable and secure agent systems is, however, a difficult
task, because mobile agents must execute autonomously in inherently
open and dynamic environments, which moreover may be characterized as:
— Vulnerable: Their openness makes them an easy target for direct attacks.
The security issues in mobile agent systems have been classified into five
categories [1]: (1) transfer security, (2) authentication and authorization, (3)
host system security, (4) computational environment security, and (5) mobile
agent system security. We are interested in the second and especially the
fifth aspects of security, which cover the problems related to initiating and
maintaining secure interaction between mobile agents meeting or residing in
the same computational environment on a given host.
— Completely decentralized: There is no global management of user identities,
and heterogeneity is common in offered services.
In our approach, we do not count on the availability of a unique user identi-
fication, because it does not scale well, and prefer to consider mobile agents
as anonymous entities. By heterogeneity, we refer to the fact that available
services may be implemented in different ways and may offer different interaction
protocols from site to site. Mobile agents must therefore have the
ability to adapt to each situation.
operations. The advantage of this solution is that the interlocutor knows noth-
ing more than what is contained in the granted interface. This entails, first, that
the security policy is completely transparent to the client, since the presence
of forbidden operations is never disclosed and always reported as non-existent,
and second, that the process of learning how to interact with an interlocutor is
greatly simplified by the fact that we prevent acquisition of information which
is not relevant. More generally, a different semantics may be attached to each
interface, according to the nature of each interlocutor. In this approach, the
agent or service dynamically generates an interface corresponding to the client's
specific role^.
The idea of having several interfaces is now common in object-oriented languages
such as Java, which is, for example, the language of the Aglets mobile agent
system [18]. The first use is to provide a gradation in the levels of visibility on
implementation details: direct instances, heirs and clients are not granted equal
accessibility to the definition of a given class. The other use pertains to the fre-
quent need to have several views of a single implementation, as attested by the
inclusion of the interface construct of Java, which partly replaces the idiomatic
uses of multiple inheritance. For instance, a client object may have to display a
specific view of itself, independently of its true implementation provider, in order
to be acceptable for a service which interacts with its clients through call-backs.
In usual object-oriented languages this is all static, i.e. the number of interfaces
and the contents of these interfaces are defined at compile-time, as well as who
has the right to access them. Interfaces are not obtained by dynamic evaluation
of access rights, they only depend on centralized and implementation-specific
criteria such as class names. Moreover these languages generally do not directly
provide support for building invocations dynamically with the appropriate ar-
guments.
Our initial hypotheses result in a solution which is necessarily quite different
from more static approaches. Contrary to other mobile agent systems (e.g.
those based on Java [3]), in the messenger paradigm [12], which we have taken as
the basis for this work, (a) there is no semantic verification of an agent's code prior
to its execution (as opposed to Java's class loader mechanism), and therefore
we can take advantage of features like dynamic typing and run-time generation
of executable code; (b) there are no predefined code libraries on the platform;
this translates into generally more adaptive behaviour, but also makes the
implementation of an infrastructure for checking digitally signed code more difficult;
(c) there is no predefined equivalent of a direct method call to a mobile agent,
because communication is rather performed through the platform's shared mem-
ory; the agent's protection domain is its private memory space, so that it has
absolute control over the visibility of its data. These are the foundations of our
approach and constitute the specificity of our proposal.
We try to provide a uniform answer to the design issues encountered when de-
signing complex applications relying on secure cooperation between many mobile
^ The interface can be seen as a meta-level [24] which performs the access right verifications.
agents from different origins. For instance, unlike objects in statically configured
distributed applications, mobile agents cannot maintain permanent references
to directly locate each other and communicate through access points known in
advance. They need directories to find and to identify each other, as well as a
uniform mechanism to initiate interaction, whether they belong to the same ser-
vice or not. Moreover, mobile agents running under the same authority should
not have the same rights if they implement different functionalities, and mobile
agents implemented by different parties should be viewable as equivalent if they
fulfill the same task. Therefore we promote an approach where each agent's access
rights are determined according to its functionality, and not solely dependent on
its implementation or authority.
2 The Messenger Paradigm
Fig. 1. Logical view of a messenger platform (process queues, channels, and arriving messengers)
dictionary which is not browsable may not be listed: one can then only get the
values for which one knows the corresponding key (Figure 1).
Messengers are basic mobile agent entities which execute in their own local
context defined by a private address space. This constitutes the messengers'
basic protection domain. Messengers do not directly provide any interface or
callable procedures. They can however communicate through a platform-level
shared memory (Figure 2) and coordinate their work by means of queues.
Data in the shared memory area (e.g. the string ' a b c ' of Figure 2) remains
hidden as long as no reference to it is published by its creator, by insertion in a
global browsable dictionary (the key y is here associated to the string ' a b c ' ) .
^ In principle a resident or sedentary messenger will not migrate after its arrival on
a given platform, unless it is forced to (local resources too expensive, not enough
clients, detection of threats to its security, etc.)
Fig. 3. Publication of the distributed semaphore service: globaldict (rw) maps the public key k2 to the service access point (rx), whose entries include fullname = "distributed semaphore", version = "1.0", P = procedure(x), V = procedure(x), and create = queue.
A messenger wishing to publish its service will choose a secret key kl and
generate a public key k2 (using a one-way function). The tuple (k2, SAP) is
inserted into the global dictionary globaldict. Then the service provider has to
add to the service dictionary servdict an entry (k2, e x p l i c i t s e r v i c e name).
Because s e r v d i c t is a browsable dictionary, any messenger can search for the
name of the service it is looking for; however only messengers possessing the
original key kl can remove the corresponding entry.
Clients will gain access to a service by means of its name (e.g. distsema as
in fig. 3). They will then use the corresponding key k2 to obtain the desired SAP
from globaldict (which is not browsable). The latter operation will finally enable
the client-furnisher interaction: in the SAP, each procedure name is associated
with the corresponding code, which is not readable (and thus may not be copied),
but only executable.
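The publication protocol just described can be sketched as follows. The dictionary names follow the text; the use of SHA-256 as the one-way function and the Python representation of dictionaries and SAPs are assumptions for illustration only, not the messenger platform's actual primitives.

```python
# Sketch of service publication: a secret key k1 yields a public key k2
# via a one-way function; (k2, SAP) goes into the non-browsable
# globaldict, (k2, name) into the browsable servdict.
import hashlib

globaldict = {}   # not browsable: values retrievable only by exact key
servdict = {}     # browsable: any messenger may list its (k2, name) entries

def one_way(k1):
    return hashlib.sha256(k1).hexdigest()

def publish(k1, name, sap):
    k2 = one_way(k1)        # public key derived from the secret key
    globaldict[k2] = sap    # (k2, SAP) into the non-browsable dictionary
    servdict[k2] = name     # (k2, explicit service name), browsable
    return k2

def lookup(name):
    # A client browses servdict for the service name, then uses the
    # associated k2 to fetch the SAP from the non-browsable globaldict.
    for k2, n in servdict.items():
        if n == name:
            return globaldict[k2]
    return None

def unpublish(k1):
    # Only the holder of the original secret k1 can remove the entry.
    k2 = one_way(k1)
    servdict.pop(k2, None)
    globaldict.pop(k2, None)

sap = {"P": "<execute-only code>", "V": "<execute-only code>"}
publish(b"secret-k1", "distsema", sap)
assert lookup("distsema") is sap
unpublish(b"secret-k1")
assert lookup("distsema") is None
```

The asymmetry is the point of the design: anyone can find the service by browsing servdict, but only the publisher, who knows k1, can later remove or replace the entry.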
This first messenger service publication mechanism uses a uniform interface
structure allowing for elementary interactions between messengers. In this model,
a unique access policy is enforced for all service clients, with a single level of
protection and no possibility of discrimination. Moreover, messengers of the same
family must access internally shared information through ad-hoc strategies, involving
e.g. predefined knowledge of hidden data structures, instead of resorting
to homogeneous protected interfacing mechanisms. For higher-level abstractions,
such as messenger families, more elaborate schemes providing multiple protection
levels are needed. Our solution is described in the following section.
3 Interlocutor-Specific Interfaces
Fig. 4. Obtaining an interlocutor-specific interface from the semaphore service
We have seen that the only criterion for accessing a service SAP is knowledge
of the service name. Phases 1 and 2 of Figure 4 correspond to the single-level
interface access to the SAP. Phases 3, 4, 5 and 6 are related to the more
elaborate protection scheme. Instead of directly obtaining a complete SAP, the
GetInterface() procedure which is found in phase 3 must be called by the
client, who also gives as argument a secret key for authentication. On the basis
of this authentication, the service will decide to which level it wants to grant
access to its own functionality, and return to the client a correspondingly gen-
erated interface. Note that the simplistic authentication scheme outlined here is
just a conceptual view; in reality we may need symmetric keys or some ad-hoc
infrastructure such as Kerberos [5].
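Phases 3 to 6 can be sketched as below. The function and key names are hypothetical, and the key comparison is a deliberately simplistic stand-in for the authentication step, which, as noted above, would in practice use symmetric keys or an infrastructure such as Kerberos.

```python
# Conceptual sketch of GetInterface(): the service authenticates the
# caller and generates an interface matching the granted level. Names
# and the toy key check are assumptions, not the messenger implementation.

FAMILY_KEY = "family-secret"

def do_p():
    return "P"

def do_v():
    return "V"

def read_state():
    return "state"

def get_interface(client_key):
    if client_key == FAMILY_KEY:
        # Family member: full protected interface, including internals.
        return {"P": do_p, "V": do_v, "internal_state": read_state}
    # Anonymous client: public interface only. Forbidden operations are
    # simply absent, never reported as "access denied".
    return {"P": do_p, "V": do_v}

public = get_interface("unknown")
protected = get_interface(FAMILY_KEY)
assert "internal_state" not in public
assert protected["internal_state"]() == "state"
```

Note how the security policy stays transparent to the client: from the public interface alone there is no way to tell that a richer protected interface even exists.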
The secret key described here determines how a client wants to identify itself.
Several identities may therefore be adopted: this may be useful if a messenger is
a composition of two services and must be viewable alternatively as belonging to
the one or to the other. A family is thus characterized by the possession of a se-
cret key which enables each messenger to authenticate itself w.r.t. other members
of the family. For security reasons, a messenger's membership of one (or several)
families is not a visible attribute. Similarly, messengers do not have identities
(they are anonymous) and cannot be directly referenced either ^. A family is
not the same as a type, because small auxiliary messengers, which receive lim-
ited functionality for executing precise sub-tasks, such as to spread information
within a distributed service, are also members of the family implementing this
service.
The granted interface, with its set of (procedure name, procedure code) tuples,
is created in the receiver's private memory. The code associated with each
procedure name is, as usual, accessible in execute-only mode, which prevents inspection
or further copying (i.e. migration or transmission) of sensitive information
to untrusted parties. This very code is just a set of stubs which hide the location
of the actual code and data in the global memory. The interface is not trans-
ferable and is therefore equivalent to a session key. It could also be viewed as a
proxy for an object which is not remote, but simply in another local protection
domain. It must be noted that it is harder to implement non-transferable objects
in systems such as Java, which do not provide separate address spaces within
a single virtual machine. One possibility is to let the interface stubs check that
the current client is the one that originally received the interface, which means
that every agent must have an accessible unique identifier. The paper [25]
describes the difficulties in implementing Java-based capabilities, a notion which
is close to our interfaces.
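The workaround mentioned above for single-address-space systems, where each stub checks that the caller is the messenger that originally received the interface, can be sketched as follows. The identifiers and function names are hypothetical.

```python
# Sketch of a non-transferable interface: every stub verifies the
# caller's identity before forwarding to the hidden implementation.
# All names here are illustrative assumptions.

def make_interface(owner_id, target):
    def stub(caller_id, *args):
        if caller_id != owner_id:
            raise PermissionError("interface is not transferable")
        return target(*args)   # location of real code stays hidden
    return {"P": stub}

def real_p(x):
    return x + 1

iface = make_interface(owner_id="agent-7", target=real_p)
assert iface["P"]("agent-7", 1) == 2      # original receiver may call

try:
    iface["P"]("agent-9", 1)              # a third party may not
    transferable = True
except PermissionError:
    transferable = False
assert not transferable
```

This is exactly the extra machinery that separate address spaces make unnecessary: when the interface lives in the receiver's private memory, no identity check is needed because no other messenger can reach it at all.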
Participants:
Resident: Defines the stationary agent that performs the local service imple-
mentation and that has to exchange critical information with other station-
ary agents.
Emissary: Mobile agent which transports the information between resident
agents.
Interface: Means by which selective access to vulnerable information is pro-
vided.
Repository: Location of information that must be only accessible through spe-
cific interfaces.
(Figure: interaction between Platform A and Platform B: getInterface() creates an interface through which getData(), init() and update() are invoked; the compiler, classes and shared memory reside on the platform.)
tance. Practically, this is realized by adding a class method called inherit, which
is available in its full form only through the protected interface of the class. Mes-
sengers which do not have enough privileges will only obtain a crippled version
of the inherit method through the public interface, and thus they will not be
able to inherit entities defined with the protected visibility mode. This approach
partly solves the security holes of Java related to malicious sub-classing as described
in [1]: now, if a client trusts a given root class, he can be more confident
that the actual instances he deals with are not harmful to him, and that he will
not receive harmful objects exploiting the sub-typing mechanism.
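The inherit-control idea can be sketched as follows: the protected interface carries the full inherit method, while the public interface carries a crippled version that refuses entities marked protected. All names here are illustrative, not the messenger system's actual classes.

```python
# Sketch of controlling inheritance through interfaces: privileged
# messengers get full_inherit, others only crippled_inherit.
# Names are hypothetical.

class Entity:
    def __init__(self, name, protected):
        self.name = name
        self.protected = protected

def full_inherit(entity):
    # Available only through the protected interface of the class.
    return f"subclassed {entity.name}"

def crippled_inherit(entity):
    # Public-interface version: refuses protected entities.
    if entity.protected:
        raise PermissionError("protected entity: inheritance refused")
    return f"subclassed {entity.name}"

secret = Entity("AuditLog", protected=True)
plain = Entity("Counter", protected=False)

assert crippled_inherit(plain) == "subclassed Counter"
assert full_inherit(secret) == "subclassed AuditLog"

try:
    crippled_inherit(secret)      # unprivileged messenger is blocked
    blocked = False
except PermissionError:
    blocked = True
assert blocked
```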
5 Related Work
Some distributed systems include a notion of computational reflection which
enables them to inspect and modify their own structure, in particular their
interface. The HADAS system [21] for instance has mobile objects, called am-
bassadors, which provide an interface for remote applications and can be seen as
an advanced form of proxy. Ambassadors are reflective in the sense that they can
change their interface according to local conditions, and can receive and even
tailor part of the functionality of the remote application they represent. They
however are not autonomous, since they cannot migrate any further, which is
similar to the limitation of Java applets [3]. Nor do they exploit their
computational reflection with the objective of enhancing security.
In Dalang [20], reflection and code modification are used to adapt software
components as they are downloaded into the executing environment. This work
is based on Java and is not specific to the mobile agent domain. When the byte
code is loaded, a specialized class loader analyzes it and modifies the behaviour
by generating an interface with code to catch method calls. This approach is
thus based on a meta-object protocol that transparently enables security-related
verifications. Interface generation in the sense of Dalang does not convey the
same meaning as in our approach, because the single interface that is created
is shared by all interlocutors. Another difference is that the associated code
generation mechanism is not triggered at run-time, but at load-time.
An interface in the Distributed Component Object Model (DCOM) [6] is a
set of functions bound to a certain object which implements them. Each object
may introduce several interfaces and a user may query one of them using the
QueryInterface function, which itself belongs to a default interface supported
by every object. QueryInterface changes the semantics of the object as seen by
its user. This approach is limited to traditional distributed systems and does not
address mobile agents. Authentication is not performed by DCOM itself, but by
the lower layers it is built upon (DCE RPC [7]). It is as yet unclear how well these
two levels integrate, and whether the delivered interfaces can be customized on
the basis of the results of the authentication in order to enhance security.
In the context of CORBA [8], several current initiatives address issues
described in this paper. The Multiple Interfaces RFP [9] deals with the
resolution of conflicts between multiple interfaces on the same object, while the
Composition Facility [9] provides the means for objects to be composed of log-
ically distinct services by the use of multiple interface definitions. In CORBA
in general, as well as in its draft Mobile Agent System Interoperability Facility
(MASIF) [10], authentication for method invocations is performed on a call-by-
call basis, and interface delivery is not under the control of the implementing
objects, but of the Interface Repository mechanism.
In our approach, interfaces are not designed to be transmitted for remote
usage. The stub routines they contain cannot be migrated because the corre-
sponding code is execute-only and may not be copied. This is consistent with
the mobile agent philosophy, which tries to limit the frequency of remote method
invocations, and is inseparable from our security architecture. The granting of
6 Conclusion
Acknowledgements
The authors would like to thank the anonymous reviewers for their comments
and useful suggestions. This work was funded by the Swiss National Science
Foundation grant 20-47162.96.
References
1. J. Vitek, M. Serrano and D. Thanos, Security and Communication in Mobile Ob-
ject Systems, in Mobile Object Systems: Towards the Programmable Internet, Sec-
ond International Workshop, MOS'96, Linz, Austria, Selected Presentations and
Invited Papers, J. Vitek and C. Tschudin (eds), LNCS vol. 1222, July 1996.
2. C. F. Tschudin, An Introduction to the M0 Messenger Language, Technical Report
86 (Cahier du CUI), University of Geneva, 1994.
3. J. Gosling and H. McGilton. The Java Language Environment. A White Paper.
Sun Microsystems, May 1995.
4. M. Muhugusa. Distributed Services in a Messenger Environment: The Case of
Distributed Shared-Memory. Ph.D. Thesis no 2903, University of Geneva, 1997.
5. J. G. Steiner, B. Clifford Neuman, and J.I. Schiller, Kerberos: An Authentication
Service for Open Network Systems, In Proceedings of the Winter 1988 Usenix
Conference, February 1988.
6. N. Brown and C. Kindel, Distributed Component Object
Model Protocol - DCOM/1.0, Internet draft, January 1998,
http://www.microsoft.com/oledev/olecom/draft-brown-dcom-v1-spec-02.txt
7. CAE Specification, X/Open DCE: Remote Procedure Call, X/Open Company Lim-
ited, X/Open Document Number C309. ISBN 1-85912-041-5, Reading, Berkshire,
UK, 1994.
8. Object Management Group, The Common Object Request Broker: Architecture
and Specification (Revision 2.0), Object Management Group, Framingham, Mass.,
1995.
9. Object Management Group, Multiple Interfaces and Composition,
work in progress, April 1998, Web information page:
http://www.omg.org/library/schedule/Multiple_Interfaces_and_Composition.htm
10. Object Management Group, The Mobile Agents Facility,
work in progress, April 1998, Web information page:
http://www.omg.org/library/schedule/Mobile_Agents_Facility_RFP.htm
11. A. Acharya, M. Ranganathan, J. Saltz, Sumatra: A Language for Resource-Aware
Mobile Programs, in Mobile Object Systems: Towards the Programmable Internet,
Second International Workshop, MOS'96, Linz, Austria, Selected Presentations
and Invited Papers, J. Vitek and C. Tschudin (eds), LNCS vol. 1222, July 1996.
12. C. F. Tschudin, The Messenger Environment M 0 - A Condensed Description, in
Mobile Object Systems: Towards the Programmable Internet, Second International
Workshop, MOS'96, Linz, Austria, Selected Presentations and Invited Papers, J.
Vitek and C. Tschudin (eds), LNCS vol. 1222, July 1996.
13. C. F. Tschudin, Open Resource Allocation, First International Workshop on Mobile
Agents (MA'97), Berlin, Germany, April 1997.
14. R. Gray, Agent TCL: A Flexible and Secure Mobile-Agent System, in Proceedings
of the fourth annual Tcl/Tk Workshop (TCL 96), July 1996.
15. J. E. White, Telescript Technology: The Foundation for the Electronic Market-
place, General Magic White Paper, General Magic, Inc., 1994.
16. C. F. Tschudin, On the Structuring of Computer Communications, Ph.D. Thesis,
University of Geneva, Switzerland, 1993.
17. D. Johansen, R. van Renesse and F. B. Schneider, Operating System Support for
Mobile Agents, in Proceedings of the 5th IEEE Workshop on Hot Topics in Op-
erating Systems, pages 42-45, Orcas Island, Wash., May 1994. Also available as
Technical Report TR94-1468, Department of Computer Science, Cornell Univer-
sity.
18. D. Lange, M. Oshima, G. Karjoth and K. Kosaka, Aglets: Programming Mobile
Agents in Java, in 1st International Conference on Worldwide Computing and
its Applications (WWCA'97), T. Masuda, Y. Masunaga and M. Tsukamoto, Eds,
LNCS vol. 1274, Springer, Berlin, Germany, pp. 253-266, 1997.
19. Y. Aridor, D. Lange, Agent Design Patterns: Elements of Agent Application De-
sign. Second International Conference on Autonomous Agents (Agents'98). Min-
neapolis/St. Paul, May 10-13, 1998.
20. Ian Welch and Robert Stroud, Dynamic Adaptation of the Security Properties of
Applications and Components., ECOOP Workshop on Distributed Object Security,
Brussels, Belgium, July 1998.
21. O. Holder and I. Ben-Shaul, A Reflective Model of Mobile Software Objects, in
Proceedings of the 17th IEEE International Conference on Distributed Computing
Systems (ICDCS'97), Baltimore, Maryland, USA, May 27-30 1997.
22. F. B. Schneider, Towards Fault-tolerant and Secure Agentry, Invited paper, 11th
International Workshop on Distributed Algorithms, Saarbrücken, Germany, Sept.
1997.
23. T. Sander and C. F. Tschudin, Towards Mobile Cryptography, in Proceedings of
Security & Privacy '98, May 1998.
24. G. Kiczales, J. des Rivieres and D. G. Bobrow, The Art of the Metaobject Protocol,
MIT Press, 1991.
25. T. von Eicken, J-Kernel: a capability-based operating system for Java, in Secure
Internet Programming, Lecture Notes in Computer Science, Springer-Verlag,
New York, NY, USA, 1999.
Introducing Trusted Third Parties to the Mobile
Agent Paradigm
Abstract. The mobile agent paradigm gains ever more acceptance for
the creation of distributed applications, particularly in the domain of
electronic commerce. In such applications, a mobile agent roams the
global Internet in search of services for its owner. One of the problems
with this approach is that malicious service providers on the agent's
itinerary can access confidential information contained in the agent or
tamper with the agent.
In this article we identify trust as a major issue in this context and
propose a pessimistic approach to trust that tries to prevent malicious
behaviour rather than correcting it. The approach relies on a trusted
and tamper-resistant hardware device that provides the mobile agent
with the means to protect itself. Finally, we show that the approach is
not limited to protecting the mobile agents of a user but can also be
extended to protect the mobile agents of a trusted third party in order
to take full advantage of the mobile agent paradigm.
1 Introduction
New approaches to distributed computing based on mobile agent technology,
such as Aglets, Telescript, or Voyager, are becoming ever more pervasive and are
considered innovative ways to structure distributed applications. A particularly
interesting and perhaps economically important class of applications, to
which mobile agents seem well adapted, is electronic commerce.
A typical use of mobile agents in the domain of electronic commerce is the
scenario in which an agent roams the global Internet in search of some service
for a user (the owner of the agent). Such a service can have many different forms,
for instance, the provision of a physical good, the execution of a search for an
information item, or the notification of the occurrence of some event. The agent
is configured by the user with all the relevant information about the desired
service, the constraints that define under which conditions an offer from a service
provider is acceptable, and a list of some potential providers of the service. It
will then migrate to the sites of these service providers in order to locate the
best offer for the service sought by the user and finalize the transaction with the
chosen service provider.
Since an agent is vulnerable when it is executing on the execution platform
of a service provider, it is necessary that the user obtains some guarantees con-
cerning the protection of his agents. Consider a mobile agent that holds data for
one or several payment methods, which it needs to finalize a purchase. These
payment data should not be available to any principal other than the one that
actually provides the service and, thus, is entitled to receive the payment. A
malicious service provider might try to obtain the data of the payment method
without providing the service or might otherwise tamper with the agent in order
to trick it into accepting the malicious provider's offer (e.g., by removing some
information about a better offer from the memory of the agent). The usual
approach taken to provide a user with certain guarantees concerning the
protection of his agents is to assume that the service providers are trusted principals
[13] or to create a mechanism that enables the user to detect which of the
providers on the itinerary have misbehaved [21].
The notion of trust has long been recognized as being of paramount impor-
tance for the development of secure systems [6,10,28]. For instance, any con-
ceivable system for authenticating users needs trusted functionality that holds
the necessary authentication information (see e.g., [18,26]). Yet, the meaning
that is associated with trust or the notion of a trusted principal is hardly ever
clearly defined in these approaches and the reader is left with his intuition.
In this article we address the question of how trust in a certain principal can
be motivated based on technical reasoning and present a pessimistic approach
to trust that tries to prevent malicious behaviour rather than correcting it after
it has occurred. The approach relies on a trusted and tamper-resistant hardware
device that can be used to enforce a policy. If this policy is properly chosen, an
agent can take advantage of it in order to protect itself as well as the information
it contains from possibly malicious service providers.
The mobile agent paradigm usually identifies two interacting principals: the
owner of the agent who configures it and the executor of the agent, which may
be identical to the service provider. However, many protocols for security related
problems, especially those that are concerned with non-repudiation, require an
additional third party that often has to be trusted by the other parties. This
principal is called a trusted third party (TTP). Due to this trust requirement, the
functionality of a TTP must be realized in a trustworthy environment, which
is usually only available at the site of the TTP. Hence, the principals in the
mobile agent paradigm have to interact with the TTP server using the classical
client/server mechanisms. The approach described here, which provides protec-
tion for the mobile agents of regular users, can also be applied to mobile agents
that are owned by a TTP. This allows us to take full advantage of the mo-
bile agent paradigm and removes the need for remote messaging between the
interacting principals.
In the following Section 2, we introduce our model for mobile agents and
point out the problems related to trust within this model. Then, in Section 3,
we discuss the notion of trust and define its relation to policy, which enables us
to better assess the possible motivations for trust. In Section 4, we introduce
a trusted and tamper-resistant hardware device and a protocol, which allow us
to define certain guarantees for the execution of agents. In Section 5, we show
how this approach can be used to protect the agents of a regular user as well as
those of a TTP. In Section 6, we discuss why we consider this to be an adequate
471
way to approach the problem and what effects this has on the notion of open
systems. Finally, Section 7 concludes the paper with a summary of the main
contributions.
former does not automatically support the transfer of the current execution
state of the agent. Thus, if an agent is supposed to visit more than a single AP
(which is often referred to as a multi-hop agent), the current execution state has
to be explicitly encoded in the agent's data before it can migrate to another AP.
The latter approach automatically supports the transfer of the current execution
state and allows an agent to continue its execution exactly where it left off before
initiating the migration. A mobile agent can thus easily visit as many APs as it
deems necessary to accomplish the desired task. In both approaches, the result
of the agent's remote execution can be sent directly to the agent owner in the
form of a message or kept in the execution state of the agent and extracted by
the agent owner when the agent returns.
The reason for only providing weak instead of strong mobility is that the
execution environment in the AP and the agent transport format can be much
simpler since the AP does not have to provide the current execution state (which
is, for instance, not available from the Java virtual machine) and the transport
format does not have to encode it. Also, if an agent visits only a single AP,
which is supported by weak mobility, the trust model becomes much simpler.
Any damage incurred by the agent to the AP (and thus to the agent executor) or
by the AP to the agent (and thus to the agent owner) can easily be attributed
to the other entity. If more than two principals are involved, the problem of
accountability becomes much more difficult [13]. Each principal can attribute any
damage to actions by one of the other principals on the agent's itinerary.
In this article we will concentrate on the protection of a mobile agent on a
particular AP. In our solution, a multi-hop agent that visits several sites only
represents multiple instances of the same problem. Our main concern is how to
protect the agent and especially the data it contains from undue manipulation
by or undesired disclosure to the agent executor, which is mainly a question of
trust in the agent executor.
might contain some very personal information about the user's special in-
terests, which the agent executor cannot infer from simply observing the
agent's choice.
finally, an agent that merely searches for some particular financial informa-
tion (e.g., stock quotes) might, depending on the owner of the agent, convey
some very sensitive information (the mere request already conveys the inter-
est in the information).
In a conventional mobile agent system, when the agent owner sends a mobile
agent to an agent executor in order to use some service, the agent owner loses
all control over the code and data of the agent. The agent executor can:
This constellation puts the agent executor in a much stronger position than
the agent owner. The agent owner simply has to trust the agent executor not to
use the methods described above to illicitly obtain confidential information from
the agent that it has to carry in order to use the service. There is no way for the
agent owner to control or even know about the behaviour of the agent executor.
The reason for the imbalance between agent executor and agent owner in the
mobile agent model as compared to the principals in the client/server model is
that in the former approach, the agent owner has no guarantees whatsoever con-
cerning the execution of its agent. In the client/server approach, the client relies
on many guarantees that are so basic that one hardly ever thinks of them. Never-
theless, these guarantees allow the implementation of certain types of behaviour
in the client part of the distributed application that can not be implemented
in conventional agent systems (e.g., code will be executed at most once, code
will be executed correctly, or the code can rely on a reasonably accurate time
service). This is due to the fact that the client implementation is under the phys-
ical control of the service user, who can observe what is happening in the
system and notice any irregularities. Thus, he is able to react accordingly, for
instance, to interrupt an ongoing transaction or to log any irregularities at the
client side so that they can be provided as evidence in the case of a dispute with
some server. This is opposed to the mobile agent paradigm, where logged data
can easily be deleted by the agent executor.
We intend to create an environment for mobile agents that allows them to
base their execution on assumptions similar to the client/server approach, so that
it becomes possible for a mobile agent to better protect itself from a malicious
agent executor.
In the optimistic approach, we give an entity the benefit of the doubt, assume
that it will behave properly, and try to punish any violation of the published
policy afterwards. In the pessimistic approach we try to prevent any violation of
the published policy in advance by effectively constraining the possible actions
of a principal to those conforming to the policy. Both of these approaches have
advantages and disadvantages.
This approach is easy to implement, since it does not require any special mea-
sures to make trusted interaction possible. This is probably the reason why it
is the basis for most business conducted today. On the other hand, it requires
some reliable mechanism to discover a policy violation after it has occurred. If
such a mechanism does not exist, then the approach degenerates to blind trust,
which indicates that there is no particular motivation to believe that a principal
will adhere to its published policy other than its own assertion. Blind trust is
obviously a very weak foundation for trust and not recommended for any im-
portant or financially valuable transaction. It is therefore important to make the
probability that a policy violation is discovered as high as possible by improving
controls and establishing checkpoints.
Once a policy violation is discovered and if it can further irrefutably be
attributed to one of the participants in the corresponding transaction, this prin-
cipal should be punished according to the appropriate laws and the damage
caused by the policy violation. The primary goal of this punishment is to deter
potential violators from committing a policy violation in the first place.
Depending on how this punishment is enacted, we identify the following two
motivations for the belief that an entity will adhere to its published policy:
— trust based on (a good) reputation and
— trust based on explicit punishment.
Trust based on reputation stems from the fact that the principal in question
is well known and has very little to gain through a violation of its own policy
but a lot to lose in case a policy violation is discovered. This loss is supposed
to transpire from the lost revenue due to customers taking their business to
another provider. Reputation is an asset that is expensive to build up and that
is invaluable for any company. Thus, we assume that a principal would not risk
to lose its good reputation for a small gain and will consequently rather adhere
to its policy.
Trust based on explicit punishment means that we do not trust the princi-
pal, but rather the underlying legal framework to ensure the principal's proper
behaviour. Here, we explicitly introduce a similar tradeoff as in trust based
on reputation by imposing disciplinary actions such as fines or imprisonment,
depending on the severity of the offence. The short term gain that might be
achieved through a policy violation is supposed to be negated by appropriate
punishment.
Obviously, there are many problems with this approach, such as the enforce-
ment of laws, which is usually expensive, quite slow, and sometimes very complex
(in particular if the laws of different countries are applicable as can be expected
for transactions on the Internet). A further difficulty lies in the very different
perceptions of punishment: a person who has not much to lose might readily risk
some years of imprisonment for the possibility of a relatively large gain. Another
problem in the optimistic approach stems from the fact that many abuses of
confidential information are not necessarily conducted for the purposes of the
company that holds this information, but rather by malicious insiders of such
a company, who do it for strictly personal reasons or financial benefits [20,25].
Such abuses are even more difficult to discover (there are fewer people involved)
and to punish (it has to be decided whether only the employee is to be pursued
for malicious behaviour, only the company for negligence, or both).
The problem to reliably discover a policy violation could be resolved by
requiring a high degree of transparency. However, this is difficult to achieve and it
is quite likely that even trustworthy principals with a good reputation might not
Fig. 1. Overview of the Principals in the CryPO protocol
4.1 Notation
The described approach relies on public key cryptography [5] (such as RSA [14]).
A detailed description of cryptography and the corresponding notations is not
within the scope of this presentation. For information on this topic see e.g., [12,
17]. The notation we will use is as follows.
A principal P has a pair (or several pairs^) of keys (K_P, K_P^-1), where K_P is
P's public key and K_P^-1 its private key. Given these keys and the corresponding
algorithm, it is possible to encrypt a message m using the receiver P's public
key K_P, denoted {m}K_P, so that only P can decrypt it with its private key. A
signed message, including a digital signature on the message m, generated by P
using its private key K_P^-1 and verifiable by anybody using the respective public
key K_P, is denoted {m}S_P.
In the following we assume the usage of optimization schemes such as encrypting
a large message with a symmetric session key, which in turn is encrypted
using public key cryptography and prepended to the message as well as the use
of hash algorithms to reduce the computational complexity of signing. However,
for ease of presentation, we will not make this explicit.
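The notation above can be made concrete with a toy, textbook RSA [14] implementation. This is a sketch for illustration only, not the paper's implementation: the tiny primes, the absence of padding, and the omitted hashing make it completely insecure, and all function names are invented here.

```python
# Toy, textbook RSA illustrating the notation: {m}K_P is encryption with
# P's public key K_P; {m}S_P is a message with a signature made using the
# private key K_P^-1. NOT secure: tiny primes, no padding, no hashing.

def make_keypair(p=61, q=53, e=17):
    n = p * q                        # public modulus
    phi = (p - 1) * (q - 1)
    d = pow(e, -1, phi)              # private exponent (Python 3.8+)
    return (e, n), (d, n)            # (K_P, K_P^-1)

def encrypt(m, pub):                 # {m}K_P: only the holder of K_P^-1 can read
    e, n = pub
    return pow(m, e, n)

def decrypt(c, priv):
    d, n = priv
    return pow(c, d, n)

def sign(m, priv):                   # {m}S_P: the message plus a signature
    d, n = priv
    return (m, pow(m, d, n))

def verify(signed, pub):             # anybody can check with the public key K_P
    e, n = pub
    m, sig = signed
    return pow(sig, e, n) == m
```

In practice, as the text notes, one would encrypt a symmetric session key rather than the message itself and sign a hash of the message rather than the full message.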
is under the complete control of the tamper-resistant module. We will call this
device trusted processing environment (TPE). The TPE (see Figure 2) provides
a complete agent platform that cannot be inspected or tampered with. Any
agent residing on the TPE is thus protected by the TPE both from disclosure and
manipulation.
The TPE is a complete computer that consists of a CPU, RAM, ROM, and non-
volatile storage (e.g. hard-disk or flash RAM). It runs a virtual machine (VM)
that provides the platform for the execution of agents and guarantees the correct
execution of the agent's code according to the definition of the used language
(e.g., Java byte-code). Below the VM is the operating system that provides the
external interface to the TPE and controls the VM (e.g., protection of agents from
each other). Furthermore, the TPE contains a private key K_TPE^-1 that is known
to no principal other than the TPE - even the physical owner of the TPE does
not know the private key. This can, for instance, be achieved by generating the
private key on the TPE^. Using this approach, the private key is never available
outside of the TPE and, thus, protected by the operating system and the tamper-
resistance of the TPE. The secrecy of the private key is a crucial requirement for
the usage of the TPE to enforce a particular behaviour.
The TPE is connected to a host computer that is under the control of the
TPE's owner. This host computer can access the TPE exclusively through a well
defined interface that allows, for instance, the following operations on the TPE:
that is contained in the TPE. This property is ensured by the TPE manufacturer
(TM), which also provides the agent executor {AE) with a certificate (signed
by TM). The certificate contains information about the TPE, such as its manu-
facturer, its type, the guarantees provided, and its public key. The agent owner
{AO) has to trust the TM (see Section 6) that the TPE actually does provide
the protection that is claimed in the certificate.
Usage After the participants have finished the initialization, they can execute
the usage part of the CryPO protocol:
- the AO queries the broker for the reference to the AE with which it wants
to interact (or it already holds this reference from a previous interaction).
- the AO checks whether the policy of the AE is acceptable and verifies the
certificate Cert_TPE to check the manufacturer and the type of the TPE, in
order to decide if it satisfies the security requirements of the AO. If any of
these checks fail, the AO aborts the protocol.
- the AO sends the agent encrypted with the public key of the TPE, {A}K_TPE,
to the AE.
^ We had originally chosen the term object since it is more general than the term
agent.
^ A reference to an AE consists of its name, its physical address in the network, its
policy, and the certificate Cert_TPE for its TPE. The broker can also verify that the AE
actually controls the corresponding TPE by executing a challenge-response protocol
with the TPE via the AE.
- the AE cannot decrypt {A}K_TPE, nor can it do anything other than upload
the agent to its TPE.
- the TPE decrypts {A}K_TPE using its private key K_TPE^-1 and obtains the
executable agent A, which it will eventually start. The agent can then interact
with the local environment of the AE or with other agents on the TPE.
- the agent can, after it has finished its task, migrate back to its owner
({A}K_AO) or to another AE to which it holds a reference.
The obvious problem of protecting the TPE from malicious agents is inde-
pendent of the described approach and has to be tackled with additional mecha-
nisms, such as code signing and sandboxing. The problem of protecting the TPE
from tampered agents can easily be solved by concatenating the agent with a
hash of the entire agent h(A), including its execution state, before encrypting
it: {A, h(A)}K_TPE. The TPE simply has to verify the correct hash before starting
the agent.
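The integrity check just described can be sketched as follows. This is a minimal illustration, not the paper's code: `pickle` stands in for the agent transport format, and the function names are invented here.

```python
import hashlib
import pickle

# Sketch of the {A, h(A)}K_TPE integrity check: the sender packages the
# agent together with its hash; after decryption, the TPE recomputes the
# hash and refuses to start a tampered agent.

def package(agent_bytes):
    """What is encrypted for the TPE: the agent concatenated with h(A)."""
    return pickle.dumps((agent_bytes, hashlib.sha256(agent_bytes).digest()))

def unpack_and_verify(blob):
    """What the TPE does after decryption; raises on tampering."""
    agent_bytes, digest = pickle.loads(blob)
    if hashlib.sha256(agent_bytes).digest() != digest:
        raise ValueError("agent was tampered with in transit")
    return agent_bytes
```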
4.4 Notes on feasibility
5 Usage of the TPE
The CryPO protocol together with the concept of a TPE guarantees the integrity
of the agent platform to the AO and protect the code and data of an agent against
manipulation and disclosure, both in transit and during execution. These guar-
antees are based on the trust relation between the AO and the TM, in which the
AO trusts the TM to properly manufacture its TPEs and to control them regu-
larly (if necessary) so that the claimed guarantees hold. The certificate enables
the AO to ensure that it really deals with a TPE from a certain manufacturer.
The above guarantees can be extended by additional properties, formulated
as rules of a policy, that can effectively be enforced by a TPE. In [24], we have
discussed how this approach can be used to allow an agent to base its execution
on results of possible previous executions on the same TPE. This can, for instance,
be used to limit the number of times an agent can be executed on a given TPE. To
achieve this, it is necessary to identify a policy that provides sufficient support
for the agent and to ensure that this policy is enforced on the TPE on which the
agent executes. With this approach, the AO does not need to trust the AE on
the proper protection of his agent, but it suffices to trust the TM. The question
why the AO should trust the TM rather than the AE is discussed in Section 6.
Now we want to address a problem that requires the cooperation with a
trusted third party (TTP). Many security related protocols, in particular those
that deal with non-repudiation, rely on such a cooperation [12]. The role of the
TTP is to provide a well defined functionality (e.g., timestamping or logging of
We assume that the TPE of the service provider enforces the following set of
rules, detailed in its policy:
The first rules a) and b) guarantee the basic protection of the agent's code
as well as its proper execution, while c) guarantees the protection of the agent's
data from undesired disclosure and manipulation. Rule d) requires the protection
of agents from one another, which is a regular operating system functionality.
The next rule e) ensures that the agent knows the policy of the TPE to which
it is transferred. Thus, the agent can ensure that it will not be sent to a TPE
that provides insufficient protection. Finally, rule f) can be used for several
purposes. For instance, it allows an agent that contains an expiration date to
implement a limited lifetime (on the order of a few days or hours). Upon its
arrival the agent requests the current time and checks if this time is still within
its attributed lifetime. If its expiration date has passed or if the TPE did not
succeed in synchronizing its clock, the agent can simply abort. An AE cannot
prevent this if the code of the agent is protected and if it will be executed
correctly.
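The lifetime check enabled by rule f) can be sketched as follows, assuming the TPE exposes the current time (or signals a failed clock synchronization). The exception and function names are illustrative, not from the paper.

```python
# Minimal sketch of rule f): on arrival the agent requests the current
# time from the TPE and aborts if its expiration date has passed or if
# the TPE could not synchronize its clock (signalled here by None).

class AgentAborted(Exception):
    pass

def check_lifetime(expires_at, tpe_time):
    """tpe_time is None when the TPE failed to synchronize its clock."""
    if tpe_time is None or tpe_time > expires_at:
        raise AgentAborted("expired or clock not synchronized")
```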
Consider a shopping agent that searches for a particular service for its owner.
Once it has found a suitable offer, it will negotiate the details of the service
provision, such as exact price and various QoS parameters, with the service
provider. As a special requirement, we specify that the shopping agent has to
create a log entry with a TTP server^ that contains the details of the negotiated
contract before providing the payment data. This allows the AO to reconstruct
the activities of his agent in the case of a dispute or if the agent is lost.
In order for the shopping agent to effectively conduct a negotiation it needs
to conceal some of its configuration information from the service provider, such
as the highest acceptable price or the lowest acceptable QoS parameters. Fur-
thermore, the shopping agent holds the public key of the TTP, which it needs
to verify the acknowledgement from the TTP server, as well as the payment
data, which should only be provided to the selected service provider after the
successful creation of a log entry.
If the shopping agent executes only on TPEs that enforce the policy discussed
in Section 5.1, it is clear that it is protected from any interference. Provided
that the agent is correct, no other entity will be able to access or manipulate
any data contained in the agent other than what is accessible via the methods
of its public interface. Thus, the agent can effectively negotiate with a service
provider, request the logging of the contract with a TTP server, and delay any
further actions until it has received a signed acknowledgement from the TTP
server.
The interaction described above allows for an efficient negotiation between the
agent and the service provider exploiting all the performance advantages of the
mobile agent paradigm. However, due to the special requirements of the AO, the
agent has to interact with a TTP server via a remote interaction. Apart from
the performance penalties of this remote interaction, the TTP server can also
become a bottleneck if its resources are consumed by a large number of clients.
Therefore, we propose to encapsulate the functionality of the TTP in a TTP
^ The agent could send the corresponding information directly to the AO, but since
it needs an acknowledgement for the receipt of the log message and since the AO
might not have a permanent connection to the network, it is preferable to delegate
this task to a TTP.
agent (TA) that can be executed on the TPE, relying on the same protection
mechanisms as the shopping agent^.
In the case of message logging, the functionality of the TTP consists of ac-
cepting arbitrary messages, storing them up to a well defined point in time t,
and responding with an acknowledgement asserting that the message has been
logged. This acknowledgement has to be signed by the TTP and must clearly
identify the message that was supposed to be logged, either by including the
message itself or, preferably, a hash of the message. Furthermore, the TTP must
be capable of reproducing a logged message up to the time t and of providing it
(exclusively to authorized principals) upon request. If necessary, a log message
can be confidentiality protected with regular encryption methods.
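The logging functionality just described can be sketched as a small interface. This is an illustration under stated assumptions: an HMAC stands in for the TTP's signature over the acknowledgement, and the class and method names are invented here.

```python
import hashlib
import hmac

# Sketch of the logging TTP: it accepts arbitrary messages, stores them
# up to the deadline t, and returns an acknowledgement that identifies
# the message by its hash. The HMAC is a stand-in for a real signature.

class LoggingTTP:
    def __init__(self, sign_key, retain_until):
        self.sign_key = sign_key          # stand-in for the signature key
        self.retain_until = retain_until  # the point in time t
        self.store = {}

    def log(self, message):
        digest = hashlib.sha256(message).digest()
        self.store[digest] = message
        # the acknowledgement covers a hash of the logged message
        return hmac.new(self.sign_key, digest, hashlib.sha256).digest()

    def reproduce(self, digest, now):
        if now > self.retain_until:
            raise KeyError("retention period t has passed")
        return self.store[digest]
```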
The task of the TA is to act as the proxy for the actual TTP on the TPE. It
will accept messages that have to be logged from agents on the TPE, store them
in a local cache, and respond with an acknowledgement in which it guarantees
that it will forward the message to the TTP server unless the TPE is destroyed
(see below). Once a log message arrives at the TTP server, it will be handled
like a normal message. In order for the TA to provide such a guarantee, it needs
access to a sufficient amount of non-volatile storage on the TPE, in which it can
safely store the log messages. Since this non-volatile storage is a limited resource
of the TPE, the TA needs a special authorization to use it. If the TPE owner does
not grant this authorization, the TA will abort.
Apart from the access to the non-volatile storage of the TPE, the requirements
of the TA are very similar to those of the shopping agent. It has to be protected
against manipulation of its code and data and it also has to conceal certain data
items from the agent executor such as two different cryptographic keys. The first
key is necessary as a means to securely forward the logged messages to the TTP
server. Since the TA is configured by the TTP, this can simply be a secret key of
a symmetric key cryptosystem. The second key is needed as a signature key to
sign the acknowledgements for logged messages. This does not necessarily have
to be the long-term signature key of the TTP, but can be a temporary key that
is validated by a certificate signed with the TTP's long-term signature key.
Again, if the TTP ensures that the TA only executes on TPEs that enforce
the policy discussed in Section 5.1 with sufficiently high assurance, it is clear that
the keys are protected and that the proper execution of the TA is guaranteed.
The remaining problem is how the TA can guarantee that logged messages that
are stored in the non-volatile storage of the TPE will eventually be forwarded to
the TTP server. The TPE owner could simply intercept all the messages of the
TA to the TTP server or, ultimately, request the TPE to terminate the TA. This
functionality has to be offered by the TPE to protect its owner from malicious
or simply buggy agents that refuse to terminate.
This problem can be solved with a supervision of the TA by the TTP and
with the help of the internal clock provided by the TPE. The TTP has to keep
^ A TTP might require a higher level of assurance in the protection of the TPE than
a regular user. This could be a differentiating feature of TPEs from different manu-
facturers.
track of all the TAs it sent to the various service providers and of an expiration
date that is associated with each TA. A TA will accept log messages only until
its expiration date. After this date it will refuse to accept and acknowledge any
further messages. Thus, the TTP has to receive a final message from the TA
after its expiration date (there should be an additional delay to accommodate
for clock skew) indicating that no further messages for the TTP are stored in
the non-volatile storage of the TPE. Provided that the TA only deletes mes-
sages from this non-volatile storage after sending them to the TTP server and
obtaining an acknowledgement, the TTP knows that all the messages that the
TA acknowledged have been forwarded to the TTP server. If this final message
is not received, the TTP will request the TPE owner to restart the TA and to
forward any messages it sends to the TTP server. Under the assumption that
the TTP has the possibility to enforce access to the TPE by legal means, the
only way for the TPE owner to avoid the provision of missing messages is to
destroy the TPE. Thus, the approach cannot guarantee that all logged messages
will be delivered to the TTP server, but it can guarantee that a TPE owner
cannot cheat without being discovered. Furthermore, if it can be proven that the
TPE owner intentionally destroyed the TPE, he can be punished with adequate
fines.
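The supervision scheme above hinges on one invariant: the TA deletes a message from non-volatile storage only after the TTP server has acknowledged it, so its final message truthfully reports an empty store. A minimal sketch, with all names invented here:

```python
# Sketch of the TA's store-and-forward discipline: refuse log messages
# after the expiration date, delete a message from (simulated)
# non-volatile storage only once the TTP server acknowledged it, and
# report in the final message whether the storage is empty.

class TrustedAgent:
    def __init__(self, expires_at):
        self.expires_at = expires_at
        self.pending = []              # stands in for non-volatile storage

    def accept(self, message, now):
        if now > self.expires_at:
            return None                # refuse after the expiration date
        self.pending.append(message)
        return ("ack", message)

    def final_report(self, send_to_ttp):
        # send_to_ttp returns True once the TTP server acknowledged receipt
        self.pending = [m for m in self.pending if not send_to_ttp(m)]
        return ("final", len(self.pending) == 0)
```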
The problem of destruction of storage media is not new and also applies
to a regular TTP. However, it is assumed that the TTP operator implements
adequate measures to avoid this problem.
5.4 Discussion
The relocation of the TTP functionality from a remote site to the locally man-
aged TPE allows us to prevent it from becoming a bottleneck that slows down
other components. This is possible since the TPE owner can allocate as many re-
sources as necessary to the TA without having to coordinate this with the TTP.
Also, since the TA can collect and merge several log messages from interactions
of different agents with the service provider, it can accumulate several log mes-
sages and forward them in a single remote interaction. Another major advantage
of the described approach is that the remote interaction with the TTP server
is taken off the critical communication path between the agent and the service
provider. They can continue their interaction as soon as the TA has stored the
log message and sent the acknowledgement. Moreover, the entire interaction can
exploit the locally available communication links with higher bandwidth and
lower latency. This enables not only a better overall performance of the system,
but allows interactions that were not possible before due to an unreasonable
overhead. For instance, a TA could be used by two interacting parties as an
intermediate through which all messages are exchanged. The TA could, thus,
easily log the entire interaction and forward it to the TTP.
The concept of the TA is suitable for any TTP functionality that can be
wrapped up in a reasonably small object and that does not have to rely on a
large centrally managed database. Other interesting examples are timestamp-
ing or fair-exchange. The former consists of a TA that adds a timestamp to a
message and signs the resulting message with its signature key. The latter is a
classical security problem, for which several solutions relying on a TTP have
been proposed [2]. The problem of fair-exchange is that of principals A and B
who want to exchange the data items DA and DB , but neither of them wants
to provide its data item before receiving that of the other principal. A TA can
facilitate the exchange by accepting the data items as well as a description of the
data items expected by the designated receivers. It will verify if the descriptions
match the actual data items (e.g., in the case of payment, it verifies if the paid
amount corresponds with the amount expected by the receiver) and, if this is
the case, deliver the data items to the designated receiver.
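The fair-exchange behaviour of the TA can be sketched in a few lines, modelling each "description of the expected item" as a predicate. This is an illustrative reduction, not a complete fair-exchange protocol; the function name is invented here.

```python
# Sketch of the fair-exchange TA: each principal deposits its item
# together with a description of the item it expects; the TA releases
# the items only if both descriptions match the actual deposits.

def fair_exchange(item_a, expect_b, item_b, expect_a):
    """Return (what A receives, what B receives), or None if no match."""
    if expect_a(item_a) and expect_b(item_b):
        return item_b, item_a       # swap: A gets B's item and vice versa
    return None                     # neither party receives anything
```

For instance, in the payment example from the text, `expect_a` would check that the paid amount corresponds to what the receiver expects.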
We have introduced the mechanism, with which an agent can take advantage of
the policy enforced by a TPE. However, as we have mentioned above, in order
for a principal to trust in the proper enforcement of this policy, it is necessary
that he also trusts the TPE manufacturer to properly design, implement, and
produce its TPEs. Since there is no way (to the knowledge of the authors) to
enforce a correct behaviour of the TPE manufacturer, it seems that the presented
approach simply replaces one required trust relationship with another one. This
is a correct observation from a theoretical point of view. Nevertheless, we believe
that the replacement of trust in an arbitrary service provider with trust in a TPE
manufacturer has several more subtle implications. We will briefly discuss the
following advantages that we identified. In contrast to a TPE manufacturer, an
arbitrary service provider:
— might not have the proper expertise to ensure a secure operation of its hard-
ware and to guarantee the protection of the processed data.
— is quite difficult to control, due to the sheer number of such service providers.
— might have no particular reputation (and therefore none to lose).
— might have short term goals that (in its point of view) justify a policy vio-
lation.
With the presented approach, such a service provider can easily define the
policy rules that it would like its TPE to enforce (by selecting from the options
offered by the TPE manufacturer) and buy the appropriate TPE from a reputable
TPE manufacturer. The service provider can then immediately benefit from the
trust that users have in the manufacturer of its TPE to convince them that it
will not maliciously abuse an agent sent by a user.
With this, the approach favours the open systems philosophy, where any
principal can possibly become a provider of services. Such a service provider
simply has to obtain a TPE from some reputable manufacturer and can then
easily convince a client that the client's confidential information is sufficiently
well protected. Thus, it becomes much easier for a new service provider to es-
tablish itself in the market.
7 Conclusion
In this paper, we have discussed the notion of trust in the context of mobile agent
systems and introduced a structuring for this problem domain. Starting from this
structure, we have proposed an approach that relies on a trusted and tamper-
resistant hardware device, which allows the prevention of malicious behaviour
rather than its correction. We believe this to be the better form of protection for
^ There is the possibility that a TPE operator bribes a TPE manufacturer to provide
an incorrect TPE. We assume that such a behaviour is a severe offence that is subject
to criminal investigation and not within the scope of this discussion.
confidential data. We have shown how the approach can be used to effectively
protect the confidential data contained in the shopping agent of a user and
how it can be extended to protect specialized agents from TTPs that provide
facilitation services.
In real-life, there are limitations to the approach. Given sufficient time and
resources, a TPE operator might succeed in breaking the system and it would
thus be possible for him to violate the policy that should be enforced by the TPE.
Our goal is to make this attack so costly that it would negate a possible gain
(there may be many different implementations that provide different levels of
assurance in the protection of a TPE). As a further deterrent, we assume that a
policy violation for which a non-repudiable proof exists, or an attempted or
successful breaking of a TPE, might be punished much more severely than a
mere policy violation, since it proves a much larger determination to commit a
criminal offence.
Acknowledgements
This research was supported by a grant from the EPFL ("Privacy" project) and
by the Swiss National Science Foundation as part of the Swiss Priority Pro-
gramme Information and Communications Structures (SPP-ICS) under project
number 5003-045364.
References
1. R. Anderson and M. Kuhn. Tamper resistance — a cautionary note. In The Second
USENIX Workshop on Electronic Commerce Proceedings, pages 1-11, Oakland,
California, November 1996.
2. H. Bürk and A. Pfitzmann. Value exchange systems enabling security and
unobservability. Computers & Security, 9(8):715-721, 1990.
3. A. Carzaniga, G. P. Picco, and G. Vigna. Designing distributed applications with
mobile code paradigms. In R. Taylor, editor, Proceedings of the 19th International
Conference on Software Engineering (ICSE'97), pages 22-32. ACM Press, 1997.
4. D. M. Chess, B. Grosof, C. G. Harrison, D. Levine, C. Parris, and G. Tsudik.
Itinerant agents for mobile computing. IEEE Personal Communications, 2(3):34-
49, October 1995.
5. W. Diffie and M. E. Hellman. New directions in cryptography. IEEE Transactions
on Information Theory, IT-22(6), November 1976.
6. DoD. Trusted Computer System Evaluation Criteria (TCSEC). Technical Report
DoD 5200.28-STD, Department of Defense, December 1985.
7. J. Gosling and H. McGilton. The Java language environment. White paper. Sun
Microsystems, Inc., 1996.
8. R. S. Gray. Agent Tcl: A transportable agent system. In Proceedings of the CIKM
Workshop on Intelligent Information Agents, Baltimore, MD, December 1995.
9. C. G. Harrison, D. M. Chess, and A. Kershenbaum. Mobile agents: Are they a
good idea? In Mobile Object Systems: Towards the Programmable Internet, volume
1222 of Lecture Notes in Computer Science, pages 25-47. Springer Verlag, 1997.
Appendix
List of Authors (Feb 1999)
Foundations
Martin Abadi
Systems Research Center
Compaq
130 Lytton Avenue
Palo Alto, CA 94301
U.S.A.
Massimo Ancona
DISI - University of Genova
Via Dodecaneso 35, I-16146 Genova
Italy
[email protected]
http://www.disi.unige.it/person/AnconaM/
Luca Cardelli
Microsoft Research
1 Guildhall Street, Cambridge CB2 3NH
United Kingdom
[email protected]
http://www.luca.demon.co.uk
Rocco De Nicola
Dipartimento di Sistemi e Informatica
Università di Firenze
Via Lombroso 6/17, I-50134 Firenze
Italy
[email protected]
http://dsi2.dsi.unifi.it/~denicola
Eduardo B. Fernandez
Department of Computer Science and Engineering
Florida Atlantic University
777 Glades Road, Boca Raton, Florida 33431 U.S.A.
[email protected]
http://www.cse.fau.edu/~ed/
GianLuigi Ferrari
Dipartimento di Informatica
Università di Pisa
Corso Italia 40, I-56125 Pisa
Italy
[email protected]
http://www.di.unipi.it/~giangi
Matthew Hennessy
School of Cognitive and Computing Sciences
University of Sussex
Falmer, Brighton BN1 9QH
United Kingdom
[email protected]
http://www.cogs.susx.ac.uk/users/matthewh/
Xavier Leroy
INRIA Rocquencourt
Domaine de Voluceau, 78153 Le Chesnay
France
[email protected]
Rosario Pugliese
Dipartimento di Sistemi e Informatica
Università di Firenze
Via Lombroso 6/17, I-50134 Firenze
Italy
[email protected]
http://dsi2.dsi.unifi.it/~pugliese
James Riely
Department of Computer Science
North Carolina State University
Raleigh, NC 27695-7534
U.S.A.
[email protected]
http://www.csc.ncsu.edu/faculty/riely
François Rouaix
Liquid Market Inc
5757 West Century Blvd
Los Angeles CA 90045
U.S.A.
Vipin Swarup
The MITRE Corporation
202 Burlington Road
Bedford, MA 01730
U.S.A.
Javier Thayer
The MITRE Corporation
202 Burlington Road
Bedford, MA 01730
U.S.A.
Concepts
Tuomas Aura
Helsinki University of Technology
Laboratory for Theoretical Computer Science
P.O. Box 1100, FIN-02015 HUT
Finland
[email protected]
http://www.tcs.hut.fi/
Matt Blaze
AT&T Labs - Research
180 Park Avenue
Florham Park, NJ 07932
U.S.A.
[email protected]
Gerald Brose
Institut für Informatik
Freie Universität Berlin
Takustraße 9, D-14195 Berlin
Germany
[email protected]
http://www.inf.fu-berlin.de/~brose
Joan Feigenbaum
AT&T Labs - Research
180 Park Avenue
Florham Park, NJ 07932
U.S.A.
[email protected]
John Ioannidis
AT&T Labs - Research
180 Park Avenue
Florham Park, NJ 07932
U.S.A.
j [email protected]
Angelos D. Keromytis
Distributed Systems Lab
CIS Department University of Pennsylvania
200 S. 33rd Str.
Philadelphia, PA 19104
U.S.A.
[email protected]
Christian F. Tschudin
Department of Computer Systems
Uppsala University
Box 325, SE - 751 05 Uppsala
Sweden
[email protected]
http://www.docs.uu.se/~tschudin/
Bennet S. Yee
Department of Computer Science and Engineering, 0114
University of California San Diego
9500 Gilman Dr
La Jolla, CA 92093-0114
U.S.A.
[email protected]
http://www.cse.ucsd.edu/~bsy/
Volker Roth
Fraunhofer Institut für Graphische Datenverarbeitung
Rundeturmstraße 6, D-64283 Darmstadt
Germany
[email protected]
http://www.igd.fhg.de/~vroth
Implementations
Martin Abadi
See contact information above.
Anurag Acharya
Department of Computer Science
University of California
Santa Barbara, CA 93106
U.S.A.
[email protected]
D. Scott Alexander
Bell Labs, Lucent Technologies
600 Mountain Avenue
Murray Hill, NJ 07974
U.S.A.
[email protected]
William A. Arbaugh
Distributed Systems Lab
CIS Department University of Pennsylvania
200 S. 33rd Str.
Philadelphia, PA 19104
U.S.A.
[email protected]
Brian N. Bershad
Department of Computer Science and Engineering
University of Washington
Box 352350
Seattle, WA 98195
U.S.A.
[email protected]
http://www.cs.washington.edu/homes/bershad
Mike Burrows
Systems Research Center
Compaq
130 Lytton Avenue
Palo Alto, CA 94301
U.S.A.
Levente Buttyan
Institute for Computer Communications and Applications
Swiss Federal Institute of Technology (EPFL)
1015 Lausanne
Switzerland
[email protected]
http://icawww.epf1.ch/buttyan
Chi-Chao Chang
Department of Computer Science
Cornell University
Ithaca, NY 14850
U.S.A.
[email protected]
http://simon.cs.cornell.edu/Info/People/chichao/chichao.html
Vipin Chaudhary
Department of ECE
Wayne State University
Detroit, MI 48202
U.S.A.
[email protected]
Grzegorz Czajkowski
Department of Computer Science
Cornell University
Ithaca, NY 14850
U.S.A.
[email protected]
http://simon.cs.cornell.edu/home/grzes/
Guy Edjlali
Department of ECE
Wayne State University
Detroit, MI 48202
U.S.A.
[email protected]
Robert Grimm
Department of Computer Science and Engineering
University of Washington
Box 98195
Seattle, WA 98195
U.S.A.
[email protected]
http://www.cs.washington.edu/homes/rgrimm
Jürgen Harms
Centre Universitaire d'Informatique
University of Geneva
Rue General Dufour 24
1211 Genève 4
Switzerland
[email protected]
Chris Hawblitzel
Department of Computer Science
Cornell University
Ithaca, NY 14850
U.S.A.
[email protected]
http://simon.cs.cornell.edu/Info/People/hawblitz/hawblitz.html
Deyu Hu
Department of Computer Science
Cornell University
Ithaca, NY 14850
U.S.A.
[email protected]
http://simon.cs.cornell.edu/Info/People/hu/hu.html
Jarle G. Hulaas
Centre Universitaire d'Informatique
University of Geneva
Rue General Dufour 24
1211 Genève 4
Switzerland
[email protected]
Trent Jaeger
IBM Thomas J. Watson Research Center
30 Saw Mill River Rd.
Hawthorne, NY 10532
U.S.A.
[email protected]
Michael B. Jones
Microsoft Research, Microsoft Corporation
One Microsoft Way, Building 31/2260
Redmond, WA 98052
U.S.A.
[email protected]
http://www.research.microsoft.com/~mbj
Angelos D. Keromytis
See contact information above.
Jonathan M. Smith
Distributed Systems Lab
CIS Department University of Pennsylvania
200 S. 33rd Str.
Philadelphia, PA 19104
U.S.A.
[email protected]
Dan Spoonhower
Department of Computer Science
Cornell University
Ithaca, NY 14850
U.S.A.
[email protected]
Sebastian Staamann
Operating Systems Laboratory (LSE)
Swiss Federal Institute of Technology (EPFL)
1015 Lausanne
Switzerland
[email protected]
http://Isewww.epf1.ch/~staa
Alex Villazon
Centre Universitaire d'Informatique
University of Geneva
Rue General Dufour 24
1211 Genève 4
Switzerland
[email protected]
Uwe G. Wilhelm
Operating Systems Laboratory (LSE)
Swiss Federal Institute of Technology (EPFL)
1015 Lausanne
Switzerland
[email protected]
http://lsewww.epfl.ch/~wilhelm
Edward Wobber
Systems Research Center
Compaq
130 Lytton Avenue
Palo Alto, CA 94301
U.S.A.
Lecture Notes in Computer Science

Vol. 1603: J. Vitek, C.D. Jensen (Eds.), Secure Internet Programming. X, 501 pages. 1999.