Amoeba: A Distributed Operating System for the 1990s
The Amoeba distributed operating system appears to users as a centralized system, but it has the speed, fault tolerance, security safeguards, and flexibility required for the 1990s.

In the next decade, computer prices will drop so low that 10, 20, or perhaps 100 powerful microprocessors per user will be feasible. All this computing power will have to be organized in a simple, efficient, and fault-tolerant system that is easy to use. The basic problem with current networks of PCs and workstations is that they are not transparent; that is, users are aware of the other machines. The user logs into one machine and uses that machine only, until doing a remote login to another machine. Few if any programs take advantage of multiple CPUs, even when all are idle.

We envision a system for the 1990s that will appear to users as a single, 1970s centralized time-sharing system. Users will not know which processors their jobs are using (or even how many), where their files are stored (or how many replicated copies are maintained to provide high availability), or how processes and machines are communicating. All resources will be managed completely and automatically by a distributed operating system.

Few such systems have been designed, and even fewer have been implemented. Fewer still are actually used by anyone yet. An early distributed system was the Cambridge system.[1] Later systems were Locus,[2] Mach,[3] the V-Kernel,[4] and Chorus.[5] Here we describe Amoeba, a distributed system developed at the Free University and the Centre for Mathematics and Computer Science in Amsterdam. Amoeba combines high availability, parallelism, and scalability with simplicity and high performance.

Although distributed systems are necessarily more complicated than centralized systems and tend to be much slower, we have worked hard to achieve extremely high performance: Amoeba is already one of the fastest distributed systems (on its class of hardware) reported so far, and future versions will be even faster. With the current implementation, a remote procedure call can be performed in 1.4 ms on Sun-3/50 class machines. The file server can deliver data continuously at 677 Kbytes per second.

The Amoeba software is based on objects. An object is a piece of data on which well-defined operations can be performed by authorized users, independent of the user's and object's locations. Objects are managed by server processes and named using capabilities chosen randomly from a sparse name space.
A process is a segmented address space shared by one or more threads of control. Processes can be created, managed, and debugged remotely. Operations on objects are implemented using remote procedure calls.

Amoeba has a unique, fast file system split into two parts: The bullet service stores immutable files contiguously on the disk; the directory service gives capabilities symbolic names and handles replication and atomicity, eliminating the need for a separate transaction management system.

To bridge the gap with existing systems, Amoeba has a Unix emulation facility consisting of a library of Unix system call routines that make calls to the various Amoeba server processes.

Figure 1. Four components of the Amoeba architecture. (Labels: processor pool, workstations, specialized servers (file, database, etc.), gateway, local area network, wide area network.)
Most classical distributed systems literature describes work on parts of or aspects of distributed systems: distributed file servers, distributed name servers, distributed transaction systems, and so on. Here we discuss the whole system, covering most of the traditional operating system design issues, including communication, protection, the file system, and process management. We explain not only what we did but also why we did it.

    Field:  Service port | Object number | Rights field | Check field
    Bits:   48           | 24            | 8            | 48

Figure 2. The structure of a capability. The service port identifies the service that manages the object. The object number specifies the object (for example, which file). The rights field determines which operations are permitted. The check field provides cryptographic protection to keep users from tampering with the other fields.
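The field widths in Figure 2 add up to 128 bits, so a capability fits in 16 bytes. The following is a minimal sketch of packing and unpacking those four fields. The type name, function names, and the big-endian byte order are assumptions made for illustration; the article specifies only the field widths and their order.

```c
#include <assert.h>
#include <stdint.h>

/* Field widths from Figure 2: 48 + 24 + 8 + 48 = 128 bits (16 bytes). */
enum { CAP_BYTES = 16 };

/* Unpacked view of a capability. The 48-bit fields are held in 64-bit
 * integers and the 24-bit field in a 32-bit integer; only the low bits
 * are meaningful. Type and field names are illustrative. */
typedef struct {
    uint64_t port;    /* 48-bit service port  */
    uint32_t object;  /* 24-bit object number */
    uint8_t  rights;  /*  8-bit rights field  */
    uint64_t check;   /* 48-bit check field   */
} capability;

/* Store the low nbytes bytes of v in big-endian order
 * (the byte order is an assumption of this sketch). */
static void put_be(uint8_t *dst, uint64_t v, int nbytes)
{
    for (int i = nbytes - 1; i >= 0; i--) {
        dst[i] = (uint8_t)(v & 0xff);
        v >>= 8;
    }
}

/* Read nbytes big-endian bytes back into an integer. */
static uint64_t get_be(const uint8_t *src, int nbytes)
{
    uint64_t v = 0;
    for (int i = 0; i < nbytes; i++)
        v = (v << 8) | src[i];
    return v;
}

/* Pack a capability into 16 bytes: port, object, rights, check. */
void cap_pack(const capability *c, uint8_t out[CAP_BYTES])
{
    assert(c->port < (1ULL << 48) && c->check < (1ULL << 48));
    assert(c->object < (1UL << 24));
    put_be(out, c->port, 6);
    put_be(out + 6, c->object, 3);
    out[9] = c->rights;
    put_be(out + 10, c->check, 6);
}

/* Inverse of cap_pack. */
void cap_unpack(const uint8_t in[CAP_BYTES], capability *c)
{
    c->port   = get_be(in, 6);
    c->object = (uint32_t)get_be(in + 6, 3);
    c->rights = in[9];
    c->check  = get_be(in + 10, 6);
}
```

Because the whole capability is an opaque 16-byte string, a user process can store or pass it around without kernel involvement, which is the property the article relies on.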
Overview of Amoeba
The Amoeba project[6] has been under way for nearly 10 years and has seen numerous system redesigns and reimplementations as design flaws became glaringly apparent. This article describes the Amoeba 4.0 system, released in 1990.

Hardware architecture. As Figure 1 shows, the Amoeba hardware consists of four components: workstations, pool processors, specialized servers, and gateways. The workstations execute only processes that require intense user interaction - for example, window managers, command interpreters, editors, and CAD/CAM graphical front ends. Most applications, however, do not interact much with the user and are run elsewhere.

Amoeba's processor pool provides most of the computing power. Typically it consists of many single-board computers, each with several megabytes of private memory and a network interface. The Free University, for example, has 48 such machines. A pile of diskless, terminalless workstations can also be used as a processor pool.

When a user has an application to run - for example, building a program consisting of dozens of source files - a number of processors can be allocated to run many compilations in parallel. When the user is finished, the processors are returned to the pool for other work. Although the pool processors are all multiprogrammed, the best performance is obtained by giving each process its own processor, until the supply runs out.

The processor pool allows us to build a system in which the number of processors exceeds the number of users by an order of magnitude or more, something quite impossible in the personal workstation model of the 1980s. The software has been designed to treat the number of processors dynamically, so processors can be added as the user population grows. When a few processors crash, some jobs may have to be restarted and the computing capacity is temporarily lowered, but otherwise the system continues normally, providing a degree of fault tolerance.

Specialized servers, the third system component, are machines for running dedicated processes with unusual resource demands. For example, it is best to run file servers on machines that have disks.

Finally, there are gateways to other Amoeba systems that can be accessed only over wide area networks. For a project sponsored by the European Community we built a distributed Amoeba system that spanned several countries. The gateway protects local machines from the idiosyncrasies of protocols that must be used over the wide area links.

Why did we choose this architecture instead of the traditional workstation model? As it becomes possible to give each user 10 to 100 processors, centralizing the computing power will allow incremental growth, fault tolerance, and the ability for a large job to obtain a large amount of computing power temporarily. Current systems have file servers, so why not let them have computer servers as well?

Amoeba software architecture. Amoeba is an object-based system using clients and servers. Client processes use remote procedure calls to send requests to server processes for carrying out operations on objects. Each object is both identified and protected by a capability, as Figure 2 shows. Capabilities have the set
May 1990 45
Amoeba Interface Language

Interfaces for object manipulation are specified in a notation called the Amoeba Interface Language.[1] AIL resembles the notation for procedure headers in C, but it has some extra syntax for automatic generation of client and server stubs. The Amoeba class for standard manipulations on filelike objects, for instance, could be specified as follows:

    class basic_io [1000..1199] {

        const BIO_SIZE = 30000;

        bio_read(*,
            in unsigned offset,
            in out unsigned bytes,
            out char buffer[bytes:bytes]);

        bio_write(*,
            in unsigned offset,
            in out unsigned bytes,
            in char buffer[bytes:BIO_SIZE]);
    };

The names of the operations, bio_read and bio_write, must be globally unique. They conventionally start with an abbreviation of the name of their class. The first parameter, indicated by an asterisk, is always a capability of the object to which the operation refers. The other parameters are labeled "in," "out," or "in out" to indicate whether they are input or output parameters to the operation, or both. Specifying this allows the stub compiler to generate code to transport parameters in only one direction.

The number of elements in an array parameter can be specified by [n:m], where n is the actual number of elements in the array and m is the maximum number. In an out array parameter such as buffer in bio_read, the maximum size is provided by the caller. In bio_read, it is the value of the in parameter bytes. The actual size of an out array parameter is given by the callee and must be less than the maximum. In bio_read it is the value of the out parameter bytes - the actual number of bytes read. On an in array parameter, the maximum size is set by the interface designer and must be a constant, while the actual size is given by the caller. In bio_write, it is the in value of bytes.

This AIL specification tells the stub compiler that the operation codes for basic_io must be allocated in the range 1000 to 1199. A clash of operation codes for two different classes matters only if these classes are both inherited by another, bringing them together in one interface. Currently, each group of people designing interfaces has a different range from which to allocate operation codes. Later we hope to allocate operation codes automatically.

The AIL stub compiler can generate client and server stub routines for a number of programming languages and machine architectures. For each parameter type, marshalling code is compiled into the stubs that convert data types of the language to AIL data types and internal representations. Currently, AIL handles only fairly simple data types (Boolean, integer, floating point, character, string) and records or arrays of them. However, it can easily be extended with more data types when the need arises.

Reference

1. G. van Rossum, "AIL - A Class-Oriented Stub Generator for Amoeba," Proc. Workshop on Experience with Distributed Systems, Springer-Verlag, Berlin, to be published in 1990.
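To make the "in out" and [n:m] conventions concrete, here is a minimal C sketch of a server-side routine with the shape of bio_read, working on an in-memory object. The names mem_object, bio_read_impl, and bio_write_size_ok are hypothetical and are not part of AIL or of the generated stubs. The sketch treats the caller's value of bytes as an upper bound that the actual count never exceeds.

```c
#include <assert.h>
#include <string.h>

/* Mirrors the constant in the basic_io class. */
#define BIO_SIZE 30000u

/* Hypothetical in-memory stand-in for a filelike object. */
typedef struct {
    const char *data;
    unsigned    size;
} mem_object;

/* Routine with the shape of bio_read:
 *   in      offset  - where to start reading
 *   in out  *bytes  - on entry the caller's maximum, on return the
 *                     actual number of bytes read
 *   out     buffer  - receives *bytes bytes
 * Returns 0 on success, -1 if the offset lies beyond the data. */
int bio_read_impl(const mem_object *obj, unsigned offset,
                  unsigned *bytes, char *buffer)
{
    assert(obj != NULL && bytes != NULL && buffer != NULL);
    if (offset > obj->size)
        return -1;
    unsigned avail = obj->size - offset;
    unsigned n = *bytes < avail ? *bytes : avail; /* never exceed the maximum */
    memcpy(buffer, obj->data + offset, n);
    *bytes = n;                                   /* report the actual size */
    return 0;
}

/* For an "in" array such as bio_write's buffer[bytes:BIO_SIZE], the
 * caller's actual size must fit the designer's constant maximum. */
int bio_write_size_ok(unsigned bytes)
{
    return bytes <= BIO_SIZE;
}
```

Only the first *bytes bytes of the buffer need to travel back to the client, which is why declaring the parameter directions lets a stub avoid copying data in both directions.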
of operations that the holder may carry out on the object coded into them, and they contain enough redundancy and cryptographic protection to make guessing an object's capability infeasible. Keeping capabilities secret by embedding them in a huge address space is the key to protection in Amoeba. Because of the cryptographic protection, capabilities can be managed outside the kernel, by user processes themselves.

Objects are implemented by the server processes that manage them. Capabilities have the identity of the object's server encoded into them (the service port) so that, given a capability, the system can easily find a server process that manages the corresponding object. The remote procedure call system guarantees that requests and replies are delivered only once, and only to authorized processes.

Although at the system level objects are identified by their (binary) capabilities, at the level where most people program and work, objects are named using a symbolic hierarchical naming scheme. The directory service maintains a mapping of ASCII path names onto capabilities and has mechanisms for performing atomic operations on arbitrary collections of name-to-capability mappings.

Amoeba has already gone through several generations of file systems. Currently, one file server is used almost to the exclusion of all others. The bullet service (which got its name from being faster than a speeding bullet) is a simple file server that stores immutable files as contiguous byte strings both on disk and in its cache.

The Amoeba kernel manages memory segments, supports processes containing multiple threads, and handles interprocess communication. The process management facilities allow remote process creation, debugging, checkpointing, and migration, all using a few simple mechanisms explained in a later section.

All other services (such as the directory service) are provided by user-level processes, in contrast to, say, Unix, which has a large monolithic kernel for these services. By putting as much as possible in user space, we have achieved a flexible system without sacrificing performance.

In the Amoeba design, concessions to existing operating systems and software were carefully avoided. But a Unix emulation service was developed to run existing software on Amoeba.

Communication

Amoeba's conceptual model is that of a client thread (thread of control or lightweight process) performing operations on objects. For example, a common operation on a file object is reading data from it. Operations are implemented by making
46 COMPUTER
Transport interface

The transport interface for the server consists of the calls get_request and send_reply, as described in the section on communication. They are generally part of a loop that accepts messages, does the work, and sends back replies, as in this C fragment:

    /* Code for allocating a request buffer */
    do {
        get_request(
            &port,
            &reqheader,
            &reqbuffer,
            reqbuflen);
        /* Code for unmarshalling
         * the request parameters
         */
        /* Call the implementation routine */
        /* Code for marshalling the
         * reply parameters
         */
        send_reply(
            &repheader,
            &repbuffer,
            repbuflen);
    } while (1);

Get_request blocks until a request comes in. Send_reply blocks until the header and buffer parameters can be reused. A client sends a request and waits for a reply by calling

    do_operation(reqheader, reqbuffer, reqbuflen,
                 repheader, repbuffer, repbuflen);

All of this code is generated automatically by the AIL compiler from the object and operation descriptions given to it.
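The "call the implementation routine" step in the server loop amounts to selecting a routine by the operation code carried in the request header. A minimal sketch of that step follows. The reduced header type, the handler names, and the individual operation code values are assumptions; the article gives only the range 1000 to 1199 for the basic_io class.

```c
#include <assert.h>
#include <stddef.h>

/* Operation codes: only the range 1000..1199 comes from the article;
 * the two individual values are assumptions of this sketch. */
enum { BIO_READ = 1000, BIO_WRITE = 1001 };

/* Hypothetical, much-reduced request header: just the operation code. */
typedef struct {
    int opcode;
} req_header;

/* Placeholder implementation routines; each returns a distinct value
 * so the dispatch can be observed. */
static int do_read(const req_header *req)  { (void)req; return 1; }
static int do_write(const req_header *req) { (void)req; return 2; }

/* Select the implementation routine for a request.
 * Returns the routine's result, or -1 for an unknown operation code. */
int dispatch(const req_header *req)
{
    assert(req != NULL);
    switch (req->opcode) {
    case BIO_READ:  return do_read(req);
    case BIO_WRITE: return do_write(req);
    default:        return -1;
    }
}
```

In a generated server stub, the unmarshalling code would run before this selection and the marshalling code after it, between get_request and send_reply.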
remote procedure calls.[7] A client sends a request message to the service that manages the object. A server thread accepts the message, carries out the request, and sends the client a reply. To increase performance and fault tolerance, multiple server processes often jointly manage a collection of similar objects to provide a service.

Remote procedure calls. The kernel provides three basic system calls to user processes: do_operation, get_request, and send_reply. The first is used by clients to get work done. It consists of sending a message to a server and then blocking until a reply comes back. The second is used by servers to announce their willingness to accept messages addressed to a specific port. Servers use the third call to send replies back. All communication in Amoeba takes this form: First a client sends a request to a server; then the server accepts the request, does the work, and sends back the reply.

No doubt systems programmers would be content with only these three system calls, but for most applications programmers they are far too primitive. Therefore a more user-oriented interface has been built on top of the mechanism, to allow users to think directly in terms of objects and operations on these objects.

Corresponding to each type of object is a class. Classes can be composed hierarchically; that is, a class can contain operations from one or more underlying classes. This multiple-inheritance mechanism allows many services to inherit the same interfaces for simple object manipulations, such as for changing the protection properties on an object or deleting it. The mechanism also allows all servers manipulating objects with filelike properties to inherit the same interface for low-level file I/O (read, write, append - see sidebar on Amoeba Interface Language). The mechanism resembles the filelike properties of Unix pipe and device I/O: The Unix read and write system calls can be used on files, terminals, pipes, tapes, and other I/O devices. But for more detailed manipulation, specialized calls are available (ioctl, popen, and so forth).

Remote procedure call transport. The Amoeba Interface Language compiler generates code to marshal or unmarshal the parameters of remote procedure calls into and out of message buffers and then call the Amoeba transport mechanism for delivery of request and reply messages (see sidebar on the transport interface). Messages consist of a header and a buffer. The header has a fixed format and contains addressing information (including the capability of the object that the remote procedure call refers to), an operation code that selects the function to be called on the object, and some space for additional parameters. The buffer can contain data. A file read or write call, for instance, uses the message header for the operation code plus the length and offset parameters, and the buffer for the file data. With this setup, marshalling the file data (a character array) takes zero time because the data can be transmitted directly from and to the arguments specified by the program.

Locating objects. Before a request for an operation on an object can be delivered to a server thread that manages the object, such a thread must be located. All capabilities contain a service port field, which identifies the service that manages the object referred to by the capability. When a server thread makes a get_request call, it provides its service port to the kernel, which records it in an internal table. When a client thread calls do_operation, the kernel's job is to find a server thread with an outstanding get_request that matches the port in the capability provided by the client.

We call the process of finding the address of such a server thread locating. It works as follows: When a do_operation call comes into a kernel, a check is made to see if the port in question is already known. If not, the kernel broadcasts a special locate packet onto the network asking if anyone has an outstanding get_request for the port in question. If one or more kernels have servers with outstanding get_requests, they respond by sending their network addresses. The kernel doing the broadcasting records the port/network address pair in a cache for future use.
Secure communication

Client requests, addressed using an object's capability, are delivered to one of the servers with outstanding get_request calls on the capability's port. Ports consist of large, 48-bit numbers known only to the server processes that make up the service and to the server's clients. For a public service such as the file system, the port will be known to all users. The ports used by an ordinary user process will, in general, be kept secret. Knowledge of a port is taken by the system as prima facie evidence that the sender has a right to communicate with the service. Of course, the service is not required to carry out
Table 1. The delay in milliseconds and the bandwidth in Kbytes per second for remote procedure calls between user processes in three common cases with three different systems. For local RPCs the client and server run on the same processor. The Unix driver implements Amoeba RPCs under Sun Unix.
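A bandwidth figure of the kind Table 1 reports follows directly from a request size and its measured delay. The short sketch below shows the arithmetic, using the 30-Kbyte, 44.0-ms measurement given in the Performance discussion. It assumes that 30 Kbytes means 30,000 bytes and that a megabit is 1,000,000 bits. Both are assumptions, chosen because they give a result close to the reported 5.4 megabits per second. The function names are illustrative.

```c
#include <assert.h>

/* Throughput in megabits per second for `bytes` of data moved in
 * `ms` milliseconds (1 megabit taken as 1,000,000 bits). */
double mbit_per_s(double bytes, double ms)
{
    assert(ms > 0.0);
    return (bytes * 8.0) / (ms / 1000.0) / 1e6;
}

/* Throughput in Kbytes per second (1 Kbyte taken as 1000 bytes). */
double kbytes_per_s(double bytes, double ms)
{
    assert(ms > 0.0);
    return bytes / (ms / 1000.0) / 1000.0;
}
```

For example, mbit_per_s(30000.0, 44.0) evaluates to roughly 5.45, and kbytes_per_s(30000.0, 44.0) to roughly 682, the same order as the 677 Kbytes per second quoted for the file server.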
Another broadcast is needed only if a server dies or migrates.

When Amoeba is run over a wide area network with a huge number of machines, a slightly different scheme is used. Each server wishing to export its service sends a special message to all domains where it wants its service known. (A domain could be a company, campus, city, or country.) In each domain a dummy process called a server agent is created. This process does a get_request using the server's port and then lies dormant until a request comes in. Then it forwards the message to the server for processing. Note that a port is just a randomly chosen 48-bit number. It does not identify a particular domain, network, or machine (see sidebar on secure communication).

Performance. We measured the speed of the Amoeba remote procedure call with some timing tests. For example, we booted the Amoeba kernel on two 16.7-megahertz Motorola MC68020s, created a user process on each, and let them communicate over a 10-megabit-per-second Ethernet. For a message consisting of just a header (no data), the complete remote procedure call (RPC) took 1.4 ms. With 8 Kbytes of data it took 13.1 ms, and with 30 Kbytes it took 44.0 ms. The latter corresponds to a throughput of 5.4 megabits per second, which is half the theoretical capacity of the Ethernet and much faster than the speeds most other systems achieve. Five client-server pairs together can achieve a total throughput of 8.4 megabits per second, not counting Ethernet and Amoeba packet headers. Table 1 shows the speeds and throughput of local communication (communication between processes on the same machine) and remote communication (communication over Ethernet between processes on different machines). Remote operations were carried out with requests containing 4 bytes, 8 Kbytes, 30 Kbytes, and empty replies. Three RPC implementations were measured: RPCs on native Amoeba, the same Amoeba protocol used from a driver under Sun Unix, and Sun's own RPCs.

Why did we base the design on objects, capabilities, and RPCs? Objects are a natural way to program. By encapsulating information, users are forced to pay attention to precise interfaces, while irrelevant information is hidden from them. Capabilities are a clean and elegant way to name and protect objects. Using an encryption scheme to protect objects moves capability management out of the kernel. The RPC is an obvious way to implement the request-reply nature of performing operations on objects.

File system

Capabilities form Amoeba's low-level naming mechanism, but they are hard for people to use. Therefore an extra level of mapping is provided from symbolic hierarchical path names to capabilities. A typical user has access to literally thousands of capabilities - those of the user's own private objects, but also capabilities of public objects, such as the executables of commands, pool processors, databases, and public files.

While a user could perhaps store his own private capabilities somewhere, a system manager or project coordinator cannot hand out capabilities explicitly to every user who may access a shared public object. Public places are needed where users can find capabilities of shared objects, so that when a new object is made shareable, or when a shareable object changes, its capability need be put in only one place.

Hierarchical directory structure. Hierarchical directory structures are ideal for implementing partially shared name spaces. Objects shared among members of a project team can be stored in a directory that only team members have access to. When directories are implemented as ordinary objects with a capability that is needed to use them, group members can be given access by giving them the capability of the directory, while others are denied access by withholding the capability. A directory capability is thus a capability for many other capabilities.

To a first approximation, a directory is a set of name/capability pairs. The basic operations on directory objects are lookup, enter, and delete. The first operation looks up an object name in a directory and returns its capability. The other two operations enter and delete objects from directo-
ries. Since directories themselves are objects, a directory can contain capabilities for other directories, thus allowing users to build an arbitrary graph structure.

Complex sharing can be achieved by making directories more sophisticated than we have just described. In reality, a directory is an (n+1)-column table with ASCII names in column 0 and capabilities in columns 1 through n. A capability for a directory is really a capability for a specific column of a directory. Thus, for example, users could arrange their directories with one column for themselves, a second column for members of their group, and a third column for everyone else. This scheme provides the same protection rules as Unix, but obviously many other schemes are possible.

The directory service can be set up so that whenever a new object is entered in a directory, the directory service first asks the service managing the object to make n replicas, which can be physically distributed for reliability. All the capabilities are then entered into the directory.

Bullet service. The bullet service is a highly unusual file server. Each bullet server supports only three principal operations: read file, create file, and delete file. When a file is created, the user normally provides all the data at once, creating the file and getting back a capability for it. In most circumstances the user will immediately give the file a name and ask the directory service to enter the name/capability pair in some directory.

All files are immutable; once created, they cannot be changed. No write operation is supported. Since files cannot change, the directory service can replicate them at its leisure for redundancy.

Since the final file size is known when a file is created, files can be and are stored contiguously, both on the disk and in bullet servers' caches, as Figure 3 illustrates. Administrative information for a file is thus reduced to its origin and size, plus some ownership data. The complete administrative table is loaded into the bullet server's memory when it is booted. For a read operation the object number in the capability is used as an index into this table, and the file is read into the cache in a single (possibly multitrack) disk operation.

Figure 3. Bullet server file representation. (Labels: bullet server memory, file table, File 1 data, File 2 data.)

The bullet file service can deliver large files from its cache or accept large files into its cache at maximum RPC speeds, that is, at 677 Kbytes per second. A remote client can read a 4-Kbyte file from a bullet server's cache (over Ethernet) in 7 ms; a 1-Mbyte file takes 1.6 seconds.[8]

Although the bullet service wastes some space because of fragmentation, its performance easily compensates for having to buy an 800-Mbyte disk to store, say, 500 Mbytes of data.

Atomicity. Ideally, names always refer to consistent objects, and sets of names always refer to mutually consistent sets of objects. In practice, this is seldom the case and is, in fact, not always necessary or desirable. But in many cases consistency is necessary.

Atomic actions are useful for achieving consistent updates to object sets. Protocols for atomic updates are well understood, and it is possible to provide a tool kit that allows independently implemented services to collaborate in atomic updates of multiple objects managed by several services.

For Amoeba we chose a different approach. The directory service handles atomic updates by allowing atomic changes in the mapping of arbitrary name sets onto arbitrary capability sets. The objects referred to by these capabilities must be immutable, either because the services that manage them refuse to change them (for example, the bullet service) or because the users refrain from changing them.

The atomic transactions provided by the directory service are not particularly useful for dedicated transaction-processing applications (for example, banking and airline reservation systems), but they do prevent the glitches that sometimes result when people use an application just as a new version is installed, or the lost update that results when two people simultaneously update a file.

Reliability and security. The directory service is crucial to the system: Nearly every application depends on it for finding the capabilities it needs. If the directory service stops, everything else will come to a halt as well. So that no single-site failure can bring it down, the directory service uses techniques similar to those used in fault-tolerant database systems to replicate all its internal tables on multiple disks.

The directory service must also work correctly and should never divulge a capability to an entity not entitled to see it. Yet even a perfectly designed directory service might allow unauthorized users to catch glimpses of data. Hardware diagnostic software, for example, has access to the directory server's disk storage. Bugs in the operating system kernel might allow users to read portions of the disk.

Directories can be encrypted so that bugs in the directory server and the operating system (or other idiosyncrasies) will not reveal confidential information. The encryption key can be exclusive-ORed with a random number and the result stored alongside the directory, while the random
number is put in the directory's capability. After giving the capability to the owner, the directory service itself can forget the random number. It needs the number only when the directory has to be decrypted to carry out operations on the directory, and will always receive the number in the capability that comes with every client's request.
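The key-handling scheme just described relies on exclusive-OR being its own inverse: the value stored beside the directory reveals nothing without the random number, and the random number arrives with each request. A minimal sketch follows. The 6-byte key length and the function names are assumptions, and the article does not say which cipher the key is used with.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum { KEY_BYTES = 6 };   /* key length is an assumption of this sketch */

/* out[i] = a[i] XOR b[i] for n bytes. */
static void xor_bytes(const uint8_t *a, const uint8_t *b,
                      uint8_t *out, size_t n)
{
    for (size_t i = 0; i < n; i++)
        out[i] = a[i] ^ b[i];
}

/* When the directory is created: compute the value that is stored
 * alongside the encrypted directory. The random number itself goes
 * into the directory's capability and is then forgotten. */
void mask_key(const uint8_t key[KEY_BYTES], const uint8_t rnd[KEY_BYTES],
              uint8_t stored[KEY_BYTES])
{
    assert(key != NULL && rnd != NULL && stored != NULL);
    xor_bytes(key, rnd, stored, KEY_BYTES);
}

/* On each request: rebuild the key from the stored value and the
 * random number carried in the client's capability. */
void recover_key(const uint8_t stored[KEY_BYTES], const uint8_t rnd[KEY_BYTES],
                 uint8_t key[KEY_BYTES])
{
    assert(stored != NULL && rnd != NULL && key != NULL);
    xor_bytes(stored, rnd, key, KEY_BYTES);
}
```

Someone who can read the disk sees only the masked value, and someone who holds only the capability has the random number but not the stored half, so neither alone yields the key.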
Why did we design such an unconventional file system? Partly to achieve great speed and partly for simplicity in design and implementation. The use of immutable files (and some other objects) allows the replication mechanism to be centralized in the directory service. Immutable files are also easy to cache (because a cached immutable file can never become stale), an

Process management

Amoeba processes can have multiple threads of control. A process consists of a segmented virtual address space and one or more threads. Processes can be remotely

(Figure labels: process capability, host descriptor, segment descriptor, thread descriptor.)
sending kernel will keep trying to communicate until the process is running again or until it is killed. Thus, communication continues with a process being interactively debugged.

A running process can be stunned by a stun request from the outside world (the stunner must have the process capability as evidence of ownership) or by an uncaught exception. When the process becomes stunned, the kernel sends its state in a process descriptor to a handler, whose identity is a capability that belongs to the process’ state. After examining the process descriptor, and possibly modifying it or the stunned process’ memory, the handler can reply with either a resume or a kill command.

Debugging and migration are done through stunning. The debugger takes the role of the handler. For migration, first the candidate process is stunned; then the handler gives the process descriptor to the new host. The new host fetches memory contents from the old host in a series of file read requests, starts the process, and returns the capability of the new process to the handler. Finally, the handler returns a kill reply to the old host. Processes communicating with a process being migrated will receive “process is stunned” replies to their attempts until the process on the old host is killed. They will then get a “process not here” reaction. After they find the process on its new host, communication will resume.

The mechanism allows command interpreters to cache process descriptors of the programs they start and kernels to cache code segments of the processes they run. Combined, these caching techniques shorten process start-up times.

Our process management mechanisms are unusual, but they are intended for an unusual environment, one where remote execution is normal and local execution is the exception. The boundary conditions for our design were a few simple mechanisms that allowed us to implement process execution, migration, debugging, and checkpointing efficiently.

Unix emulation

Amoeba’s system interface is quite different from those of today’s popular operating systems. We did not want to write hundreds of utility programs for Amoeba from scratch, so we quickly decided to write a Unix emulation package to allow most Unix utilities to work on Amoeba, sometimes with small changes. We considered binary compatibility but rejected it for an initial emulation package because binary compatibility is more complicated and less useful. (First, we would have to choose a particular version of Unix; second, binaries usually work for only one machine architecture, while sources can be compiled for any machine architecture; and third, binary emulation is bound to be slow.)

Our emulation facility started as a library of Unix routines that have the standard Unix interface and semantics but do their work by calling the bullet service, the directory service, and the Amoeba process management facilities. The system calls implemented initially were those for file I/O (open, close, dup, read, write, lseek) and a few of the ioctl calls for ttys. These were very easy to implement under Amoeba (about two weeks’ work) and were enough to run a surprising number of Unix utilities.

Next a session server was developed to allocate Unix PIDs and PPIDs, and to assist in the handling of system calls involving them (for example, fork, exec, signal, kill). The session server is also used for dealing with Unix pipes and allows many other Unix utilities to run on Amoeba. Users each start one session server alongside their login shell.

About 150 utilities now run on Amoeba without any changes to the source code. We have not attempted to port some of the more esoteric Unix programs, but we are working to make our Unix interface compatible with some emerging standards (for example, IEEE Posix).

The X Window System has been ported to Amoeba and supports both TCP/IP and Amoeba RPCs, so an X client on Amoeba can converse with an X server on Amoeba and vice versa.

The Unix utilities have eased the transition to Amoeba. Gradually, however, many of them will be replaced by utilities better adapted to the Amoeba distributed environment. Our new parallel Make is an obvious example.

If we had designed a system that was binary compatible with Unix, it would not have been much of a step beyond the ideas of the early 1970s. We wanted a new system for the 1990s, designed from the ground up. If the Unix designers had constrained themselves to being binary compatible with the then-popular RT-11 operating system, Unix would not be where it is now.

Among the design decisions for Amoeba we have been most pleased with is our determination not to restrict ourselves to existing operating systems or operating system interfaces. Unix is an excellent operating system, but it was not designed for distributed systems. We could not have made such a balanced design with a Unix interface. Nevertheless, we found it remarkably easy to port to Amoeba all the Unix software we wanted to use. Programs that are hard to port are mostly for operations that Amoeba handles in other ways (network access and system maintenance and management, for example).

Amoeba’s use of objects and capabilities means that when we design a service we need not worry about the protection of its objects. The capabilities mechanism automatically provides enough protection. The system also provides a very uniform and decentralized object-naming and object-access mechanism.

Building directly on the hardware instead of on an existing operating system has been absolutely essential to Amoeba’s success. A primary goal was to design and build a high-performance system, and this can hardly be done on top of another system. As far as we can tell, only systems with custom-built hardware or special microcode can outperform Amoeba’s remote procedure calls and file system on comparable hardware.

The Amoeba kernel is small and simple. It implements only a few operations for process management and interprocess communication, but they are versatile and easy to use. The kernel is easy to port between hardware platforms. Amoeba now runs on VAXs and on Motorola MC68020 and MC68030 processors, and is currently being ported to the Intel 80386.

Acknowledgments

The work described here has been supported by grants from NWO, the Netherlands Organization for Scientific Research; SION, the Foundation for Computer Science Research in the Netherlands; OSF, the Open Software Foundation; and Digital Equipment Corporation.
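
As an aside on the process management discussion earlier in this article, the stun-based migration sequence (stun the process, hand its descriptor to the new host, let the new host fetch memory in a series of reads, start the new copy, then kill the old one) can be sketched as a toy Python simulation. This is only an illustration under stated assumptions, not Amoeba code: the names Host, Process, adopt, migrate, the dictionary used as a "process descriptor", and the tuple used as a "capability" are all hypothetical stand-ins.

```python
from dataclasses import dataclass

# Reply strings taken from the article's description of migration.
STUNNED = "process is stunned"
NOT_HERE = "process not here"


@dataclass
class Process:
    name: str
    memory: bytes
    stunned: bool = False


class Host:
    """Hypothetical stand-in for the kernel on one machine."""

    def __init__(self, label):
        self.label = label
        self.procs = {}  # process name -> Process

    def start(self, name, memory):
        self.procs[name] = Process(name, memory)
        return (self.label, name)  # toy "capability" for the new process

    def stun(self, name):
        proc = self.procs[name]
        proc.stunned = True
        # Toy "process descriptor" that is handed to the handler.
        return {"name": name, "size": len(proc.memory), "home": self}

    def read_memory(self, name, offset, length):
        return self.procs[name].memory[offset:offset + length]

    def adopt(self, desc, chunk=4):
        # The new host pulls memory from the old host in a series of reads.
        old = desc["home"]
        image = b"".join(
            old.read_memory(desc["name"], off, chunk)
            for off in range(0, desc["size"], chunk)
        )
        return self.start(desc["name"], image)

    def kill(self, name):
        del self.procs[name]

    def send(self, name, msg):
        proc = self.procs.get(name)
        if proc is None:
            return NOT_HERE
        if proc.stunned:
            return STUNNED
        return f"{name}@{self.label} got {msg!r}"


def migrate(name, old, new):
    """Handler-side view of migration: stun, transfer, then kill the old copy."""
    desc = old.stun(name)
    cap = new.adopt(desc)
    old.kill(name)
    return cap
```

In this sketch, a sender that calls send on the old host during the transfer gets the "process is stunned" reply, gets "process not here" once the old copy is killed, and succeeds again after it redirects to the new host. How senders locate the new host is left out.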
References
1. R.M. Needham and A.J. Herbert, The Cambridge Distributed Computing System, Addison-Wesley, Reading, Mass., 1982.
2. B. Walker et al., “The LOCUS Distributed Operating System,” Proc. Ninth Symp. Operating System Principles, ACM, Operating Systems Review, Vol. 17, No. 5, 1983, pp. 49-70.
3. M. Accetta et al., “Mach: New Kernel Foundation for UNIX Development,” Proc. Summer Usenix Conference, Usenix, Sunset Beach, Calif., 1986.
4. D.R. Cheriton, “The V Distributed System,” Comm. ACM, Vol. 31, No. 3, Mar. 1988, pp. 314-333.
5. M. Rozier et al., “CHORUS Distributed Operating Systems,” Report CS/Tech. Report-88-7.6, Chorus Systems, Paris, 1988.
6. S.J. Mullender and A.S. Tanenbaum, “The Design of a Capability-Based Distributed Operating System,” Computer J., Vol. 29, No. 4, Mar. 1986, pp. 289-300.
8. R. van Renesse, J.M. van Staveren, and A.S. Tanenbaum, “Performance of the Amoeba Distributed Operating System,” Software - Practice and Experience, Vol. 19, No. 3, Mar. 1989, pp. 223-234.

Guido van Rossum is a research assistant at the Centre for Mathematics and Computer Science in Amsterdam. Since 1987 he has been with the Amoeba project, working on an RPC interface specification language, a Unix emulation facility, user interface issues, and system integration. Earlier he worked on the ABC Programming Language project.
Van Rossum studied mathematics and computer science at the University of Amsterdam and received a master’s degree in 1982.

Hans van Staveren is one of the implementers of the Amoeba distributed operating system, working primarily on network protocols and kernel efficiency. Earlier he spent four years researching code generation in the framework of the Amsterdam Compiler Kit.
Van Staveren graduated from the Free University in Amsterdam in 1980.

Andrew S. Tanenbaum is a professor of computer science at the Free University in Amsterdam. His research interests include distributed operating systems, programming languages, and compilers. He is the author of the Minix operating system, a principal designer of the Amsterdam Compiler Kit, and a chief architect of the Amoeba distributed operating system.
Tanenbaum received his BS from MIT and his PhD from the University of California, Berkeley. He is a member of ACM, the IEEE Computer Society, and Sigma Xi.

Sape J. Mullender heads the distributed systems and computer networks research group at the Centre for Mathematics and Computer Science in Amsterdam. He has been a visiting scientist at DEC’s Systems Research Center in California and a visiting research fellow at Cambridge University. Mullender’s research interests include high-performance communication in distributed systems and the design of scalable fault-tolerant distributed file servers. He is also concerned with organization and protection in distributed systems that can span a continent.
Mullender is vice chairman and conference coordinator of the ACM Special Interest Group on Operating Systems. He received his PhD at the Free University in Amsterdam. He is a member of ACM and the IEEE.

Robbert van Renesse is a researcher in the Computer Science Department at Cornell University on a grant from the Netherlands Organization for Scientific Research. He is working on the management of distributed systems to improve their robustness, performance, and scalability.
Van Renesse received his PhD in computer science in 1989 from the Free University of Amsterdam.

The authors can be contacted at the Centre for Mathematics and Computer Science, PO Box 4079, 1009 AB Amsterdam, the Netherlands.