Technologies For Ubiquitous Supercomputing: A Java Interface To The Nexus Communication System
SUMMARY
We use the term ubiquitous supercomputing to refer to systems that integrate low- and midrange computing systems, advanced networks and remote high-end computers with the goal of
enhancing the computational power accessible from local environments. Such systems promise
to enable new applications in areas as diverse as smart instruments and collaborative environments. However, they also demand tools for transporting code between computers and for
establishing flexible, dynamic communication structures. In this article, we propose that these
requirements be satisfied by introducing Java classes that implement the global pointer and
remote service request mechanisms defined by a communication library called Nexus. Java
supports transportable code; Nexus provides communication support and represents the core
communication framework for Globus, a project building infrastructure for ubiquitous supercomputing. We explain how this NexusJava library is implemented and illustrate its use with
examples. © 1997 by John Wiley & Sons, Ltd.
1. INTRODUCTION
Rapid advances in networking technologies have made it possible to construct an application that integrates resources located at multiple geographically distributed locations.
Various high-end networking experiments have demonstrated convincingly that important
new classes of applications become possible in such environments[1]. Typically, these
applications exploit high-speed networks to assemble in one (virtual) place collections of
resources that would not otherwise be accessible, such as scientific instruments, supercomputers, databases and people.
Most work on high-performance distributed computing has originated within the high-performance computing community, and these origins are reflected in the types of applications considered and the techniques used to construct these applications. Supercomputers
are highly visible, and programs typically use message passing to transfer data between
program components. The user interfaces with the application from a local system or, in
many cases, from a high-end display device[1]. While effective, these techniques have the
drawback that they hinder the widespread dissemination of the technology, for example
because sophisticated software systems must be installed at each participating site[2].
An alternative model for high-performance distributed computing focuses on making the
power of remote supercomputers accessible to users in a completely transparent manner.
The goal is to support the development of applications that execute locally (whether on
a low-end PC or high-end workstation) and exploit remote supercomputing resources to
provide enhanced services. We use the term ubiquitous supercomputing to denote this
type of computing, because by coupling low-cost local devices with remote supercomputer
resources, it combines aspects of ubiquitous computing[3] and traditional supercomputing.
CCC 1040-3108/97/060465-11 $17.50
This article is concerned with the tools that might be used to construct ubiquitous
supercomputing systems and applications. We explain how a combination of the Java
programming language and two simple mechanisms, the global pointer and the remote service
request, can be used to satisfy these requirements.
2. UBIQUITOUS SUPERCOMPUTING
We discuss the types of applications that might be constructed in a ubiquitous supercomputing system.
2.1. Smart instruments
The utility of many scientific instruments can be enhanced significantly by the use of computational techniques. For example, in the case of an imaging device, computers can be
used to enhance images, to annotate images with hints as to significant features, to locate
similar images, to provide comparisons of observation and theory, or to integrate information from several imaging modalities. Such techniques have been used to a limited extent
for some time; however, in general, only fairly limited computation could be performed
because it was not feasible to co-locate a high-end computer with the instrument. The advent of high-speed networks makes it feasible to use a single supercomputer to serve many
instruments, with the result that the computational power accessible to a single instrument
increases dramatically. Quasi-real-time computer-enhanced imaging becomes possible.
Lee et al.[4] have developed an interesting example of this type of application. The
instrument in question, a weather satellite, takes pictures at multiple wavelengths. Data
from the satellite are received at the ground station and passed over a wide area network
to a supercomputer, where they are enhanced by a cloud detection algorithm to obtain
three-dimensional images of cloud location. These images are then passed to a display
device, allowing scientists to browse the computer-enhanced images almost in real time.
2.2. Smart applications
Similar techniques can be used to enhance the utility of desktop applications. Currently,
these may have sophisticated user interfaces but perform relatively simple computations.
The ability to connect to substantially greater computing resources can allow desktop
applications to perform more demanding computations. For example, a future spreadsheet
might connect to a model of the U.S. economy when evaluating investment strategies, or to
a climate model when evaluating risk management strategies for an agricultural concern.
A system for preparing audiovisual presentations might reach over the network to search
massive image banks for pictures matching a specified textual description or might exploit
external computing resources to render a video clip.
Simple examples of this sort of tool have already been constructed. To name just two
examples, the Network Enabled Optimization System (NEOS) allows users to submit
optimization problems electronically to an optimization server, while NetSolve allows
desktop applications written in MatLab to pass computationally demanding tasks to highperformance computers. In both these cases, access to networked resources is far from
seamless; however, these systems are suggestive of how future smart applications might
work.
in various Web browsers, making it possible for users to create Web pages that perform
various computations.
While Java has significant advantages as a language for ubiquitous computing, it is deficient in the important area of communication. (Other significant shortcomings, for example,
in the security area, are beyond the scope of this article, which focuses on communication frameworks.) The Java library provides only basic support for communication using
low-level UDP and TCP protocols. The lack of higher-level communication mechanisms
greatly complicates the implementation of applications such as those described above.
We argue that communication facilities for Java should satisfy four basic requirements:
(i) asynchrony. While synchronous remote procedure call (RPC) is appropriate for
many distributed applications, particularly those with a client-server structure, high-performance ubiquitous supercomputing applications also require mechanisms that
do not enforce synchronization between sender and receiver, such as asynchronous
remote function invocation and, in some cases, point-to-point communication (message passing).
(ii) symmetry. Clients (user Java programs) and servers (remote processes) must be
able to act as equal partners in a computation. Not only should a client be able to call
procedures in a server, but a server should also be able to call procedures in a client.
(iii) global names. The ability to create references to objects and then communicate
those references between objects proves to be extremely useful in practice, making
it possible to create complex, distributed data structures and to write programs that
operate on these data structures in a uniform fashion, independently of object location.
Note that what is required here is a global name space, not a global address space.
(iv) high performance. We require techniques that permit high-performance implementations. This requirement means not only that our techniques should not introduce
performance bottlenecks, but that they should permit us to write programs that can
adapt their behavior to the often complex heterogeneous systems in which they can
be expected to operate.
As we explain in the next section, we propose to meet these requirements by developing
a Java binding for a communication library called Nexus that provides remote object
reference (called, in Nexus, a global pointer) and asynchronous remote method invocation
(in Nexus, remote service request) mechanisms.
4. NEXUS
Nexus is a communication library developed at Argonne National Laboratory and the
California Institute of Technology to support applications that require mechanisms for
asynchronous communication, multithreading and dynamic resource management in heterogeneous environments[8].
Nexus services provide direct support for lightweight threading, address space management, communication and synchronization. The Nexus interface is structured in terms of
five basic abstractions, illustrated in Figure 1: nodes, contexts, threads, global pointers and
remote service requests. A computation executes on a set of nodes and consists of a set of
threads, each executing in an address space called a context. For the purpose of this article,
it suffices to assume that a context is equivalent to a process and that a node is equivalent
to a particular computer.
A global pointer (GP) is a name that can refer to a memory location (or object) located
anywhere in a distributed system. GPs are used in conjunction with asynchronous remote
service requests (RSRs) to invoke actions at remote locations. An RSR takes a GP, a
procedure name and data, transfers the data to the context referenced by the GP, and
remotely invokes the specified procedure, providing the data and the local portion of the
GP as arguments. GPs can be passed as arguments to RSRs, hence allowing global names
to be propagated between processes.
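The interplay of GP and RSR described above can be modeled in a few lines of Java. The following is a single-process toy and not the Nexus implementation: the names ToyContext, ToyGlobalPointer, ToyCounter and remoteServiceRequest are hypothetical, and an in-memory queue drained by the owning context stands in for the network.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.function.BiConsumer;

// Toy stand-in for a context: named handlers plus a queue of pending RSRs.
class ToyContext {
    final Map<String, BiConsumer<Object, int[]>> handlers = new HashMap<>();
    final ConcurrentLinkedQueue<Runnable> inbox = new ConcurrentLinkedQueue<>();

    // Run every queued request; returns the number of handlers invoked.
    int drain() {
        int count = 0;
        Runnable request;
        while ((request = inbox.poll()) != null) {
            request.run();
            count++;
        }
        return count;
    }
}

// Toy global pointer: names an object together with the context that holds it.
class ToyGlobalPointer {
    final ToyContext context;
    final Object local; // the "local portion" handed to the handler

    ToyGlobalPointer(ToyContext context, Object local) {
        this.context = context;
        this.local = local;
    }

    // Toy RSR: copy the data, queue the invocation, return without waiting.
    void remoteServiceRequest(String handlerName, int[] data) {
        int[] copy = data.clone();
        context.inbox.add(() -> context.handlers.get(handlerName).accept(local, copy));
    }
}

// The object that the global pointer refers to in this demo.
class ToyCounter {
    int total;
}

class ToyRsrDemo {
    static int run() {
        ToyContext server = new ToyContext();
        ToyCounter counter = new ToyCounter();
        server.handlers.put("server_handler", (obj, d) -> ((ToyCounter) obj).total += d[0]);

        ToyGlobalPointer gp = new ToyGlobalPointer(server, counter);
        gp.remoteServiceRequest("server_handler", new int[] {10});
        int before = counter.total; // still 0: the request is only queued
        server.drain();             // the receiving context now runs the handler
        return counter.total - before;
    }

    public static void main(String[] args) {
        System.out.println(run());
    }
}
```

The sender returns as soon as the request is queued, which is the asynchrony the RSR is meant to provide; the handler receives both the transferred data and the object named by the pointer.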
Experience indicates that Nexus mechanisms can be implemented efficiently on a wide
range of parallel and networked computer systems[8]. Furthermore, global pointers can be
used as a basis for mechanisms that support both automatic and programmer-guided selection from among multiple communication methods[9]. These mechanisms allow programs
to execute efficiently in heterogeneous environments and make it possible to use different
communication protocols for different communication structures. Nexus has been used to
implement a variety of different parallel and distributed programming tools providing different interaction models, including remote procedure call (in CC++[10] and nPerl, an RPC
library for the Perl scripting language), multimedia streams (in CAVEcomm[11]) and message passing (the Message Passing Interface[12]). Nexus also serves as the communication
infrastructure for the Globus distributed computing infrastructure toolkit[13].
Nexus mechanisms satisfy each of the requirements introduced above. The RSR provides
an asynchronous communication substrate, on which can be layered a variety of more
sophisticated interaction methods. The global pointer makes it easy to specify symmetric
structures, since a client can easily pass a global pointer to a server, hence allowing the
server to invoke procedures in the client. Global pointers also provide a global name space.
Finally, Nexus mechanisms have been shown to permit high-performance implementations.
5. A JAVA BINDING FOR NEXUS
We have constructed a Java binding for Nexus; that is, an interface to Nexus mechanisms
that allows Java programs to create and exchange global pointers and to perform remote
service requests to methods defined in objects referenced by these global pointers. This
binding also allows Java programs to communicate with other programs (such as MPI or
one of the many parallel languages that support Nexus) that employ Nexus mechanisms.
The Java binding for Nexus implements just the Nexus global pointer and remote service
request mechanisms. Nexus also includes a set of thread management, condition
variable and mutual exclusion (mutex) functions; however, these functions need not be
implemented in the Java binding for Nexus, because the Java Thread class supports these
functions and the Java language itself provides synchronization mechanisms at the
object and method levels.
As we shall explain, the Java binding provides direct access to the relatively low-level
Nexus interface; this interface can then be used to build higher-level Java communication
libraries for specific purposes.
We implement the Java binding as a Nexus-compatible library written entirely in Java.
This means that Nexus code can run within any system that incorporates a Java interpreter
or just-in-time compiler. The library comprises four basic classes: Nexus, which supports
initialization, argument handling, handler registration, global pointer creation, and attachment to other processes; GlobalPointer, which implements the Nexus global pointer
abstraction, for use in remote service requests; PutBuffer, which provides mechanisms
for buffer packing; and GetBuffer, which provides buffer unpacking mechanisms. We
shall use a simple example to illustrate the use of the various functions defined in these
classes. (NexusJava function prototypes are generally equivalent to those of the Nexus C
library.)
Figure 1. [Diagram of nodes, contexts and global pointers (GP); the contexts hold variables such as int i and int j.]
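The division of labor between a packing class and an unpacking class can be sketched with standard Java streams. This is an illustration only: SketchPutBuffer and SketchGetBuffer are hypothetical names, and the big-endian layout produced by DataOutputStream is an assumption, since the article does not specify the NexusJava wire format.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

// Hypothetical packing class: appends typed values to a byte buffer.
class SketchPutBuffer {
    private final ByteArrayOutputStream bytes = new ByteArrayOutputStream();
    private final DataOutputStream out = new DataOutputStream(bytes);

    SketchPutBuffer putInt(int value) {
        try {
            out.writeInt(value); // 4 bytes, big-endian
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return this;
    }

    byte[] toBytes() {
        return bytes.toByteArray();
    }
}

// Hypothetical unpacking class: reads values back in the order they were packed.
class SketchGetBuffer {
    private final DataInputStream in;

    SketchGetBuffer(byte[] wire) {
        in = new DataInputStream(new ByteArrayInputStream(wire));
    }

    int getInt() {
        try {
            return in.readInt();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}

class BufferDemo {
    public static void main(String[] args) {
        byte[] wire = new SketchPutBuffer().putInt(10).putInt(-7).toBytes();
        SketchGetBuffer get = new SketchGetBuffer(wire);
        System.out.println(get.getInt() + " " + get.getInt());
    }
}
```

Sender and receiver must agree on the order and types of the packed values, which is why the stub code discussed later in the article is a natural candidate for automatic generation.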
Our example comprises the simple client and server programs in Figure 2. The client
performs a single remote service request to the server. The client terminates immediately
after generating the request, and the server terminates immediately after performing the
request. This trivial example does not really demonstrate the expressiveness of NexusJava,
but does have the pedagogical advantage of introducing most NexusJava features.
    public class ExampleClient {
        private Nexus nexus;

        public static void main(String args[]) {
            ExampleClient n = new ExampleClient();
            n.start(args);
        }

        public void start(String args[]) {
            GlobalPointer gp;
            nexus = new Nexus();
            args = nexus.init(args, "nx", null);
            try {
                // Attach to the server, send one RSR, then destroy the GP.
                gp = nexus.attach("x-nexus://cosmo.mcs.anl.gov:1234/");
                call_server_handler(gp, 10);
                gp.destroy();
            } catch (Exception e) {
                e.printStackTrace();
            }
            nexus.destroy_current_context(false);
        }

        public void call_server_handler(GlobalPointer gp, int i) {
            PutBuffer buffer;
            try {
                // (a) initiate the RSR, (b) pack the argument, (c) send it.
                buffer = gp.init_remote_service_request("server_handler", 42);
                buffer.set_buffer_size(buffer.sizeof_int(1), 1);
                buffer.put_int(i);
                buffer.send_remote_service_request();
            } catch (Exception e) {
                e.printStackTrace();
            }
        }
    }
Figure 2. Example: Client program that demonstrates initialization, packing a buffer and sending an RSR
The client begins by instantiating and initializing a Nexus object. This must be done
before any other NexusJava operations are performed. The client then attaches to the server
using the Nexus.attach() method. This method takes as its argument a URL specifying
the hostname and port on which the server is listening; it returns a GlobalPointer
referencing an object in the server process.
Once the client has attached to the server, it can use the GP to invoke methods
defined in the remote object that this pointer references. For example, the procedure
call_server_handler() invokes a remote procedure called server_handler, passing as its argument the single integer 10. It calls low-level Nexus routines to (a) initiate
the remote service request, (b) construct a buffer containing the integer argument, and
(c) complete the RSR. The client then uses the GlobalPointer.destroy() method to
destroy the GP to the server; this action severs the connection between the client and server.
Finally, the client shuts down NexusJava by calling the destroy_current_context()
method on the Nexus object. This action cleanly terminates any threads and other state
maintained by this object.
The server program, like the client, first instantiates and initializes a Nexus object.
Then, it registers the set of handler names for which it will accept messages. The registration is performed by the routine register_my_handlers(), which creates an array
of Handler objects in which each element describes a handler. This description includes
the handler name (e.g. server_handler), a handler id (e.g. 42), a flag specifying whether
this handler should be invoked in a newly created thread or in an existing thread, the
HandlerInterface object to call when an RSR arrives for this handler, and a local handler id that can be used for quick dispatch of the handler within that HandlerInterface
object. The Nexus.register_handlers() method is then called with the Handler array
to inform the Nexus object of the handlers for which RSRs are to be accepted.
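The five pieces of information carried by each handler description can be pictured as a small table keyed by handler name. The sketch below is an assumption-laden model, not the NexusJava classes: RegHandlerInterface, RegHandler and RegTable are hypothetical names, and arguments are passed as an int array rather than a GetBuffer to keep the example self-contained.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical callback interface, modeled on the HandlerInterface described above.
interface RegHandlerInterface {
    void invokeHandler(String name, int id, int localId, int[] args);
}

// One registration entry: name, id, threading flag, target object, local id.
final class RegHandler {
    final String name;
    final int id;
    final boolean threaded;
    final RegHandlerInterface target;
    final int localId;

    RegHandler(String name, int id, boolean threaded,
               RegHandlerInterface target, int localId) {
        this.name = name;
        this.id = id;
        this.threaded = threaded;
        this.target = target;
        this.localId = localId;
    }
}

// Hypothetical registry: remembers which handlers will accept RSRs.
class RegTable {
    private final Map<String, RegHandler> byName = new HashMap<>();

    void registerHandlers(RegHandler[] handlers) {
        for (RegHandler h : handlers) {
            if (byName.putIfAbsent(h.name, h) != null) {
                throw new IllegalArgumentException("duplicate handler: " + h.name);
            }
        }
    }

    // Returns null when no handler of that name has been registered.
    RegHandler lookup(String name) {
        return byName.get(name);
    }
}

class RegDemo {
    public static void main(String[] args) {
        RegTable table = new RegTable();
        RegHandlerInterface server = (name, id, localId, a) ->
            System.out.println(name + " received " + a[0]);
        table.registerHandlers(new RegHandler[] {
            new RegHandler("server_handler", 42, false, server, 0)
        });
        RegHandler h = table.lookup("server_handler");
        h.target.invokeHandler(h.name, h.id, h.localId, new int[] {10});
    }
}
```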
After registering the handlers, the server next calls Nexus.allow_attach() to indicate
that it is prepared to accept incoming RSRs. It then suspends in wait_for_client(), processing subsequent attachment or RSR requests as callbacks. Attachment requests result
in calls to the attach_approval() method in the AttachApprovalInterface object
passed as the second argument to allow_attach(). The attach_approval() method
returns a GP to a local object, which will be returned to the attacher. The server may also
decide to deny the attachment request, in which case it must return null.
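The approve-or-deny convention can be illustrated with a callback that returns either the object to expose or null. This sketch models only that convention: AttachSketchServer, allowAttach and handleAttach are hypothetical names, a plain Object stands in for the returned GP, and the other arguments of the real allow_attach() call are not modeled.

```java
import java.util.function.Function;

// Hypothetical server-side model of attachment approval.
class AttachSketchServer {
    // Approval callback; null until the server allows attachments.
    private Function<String, Object> approval;

    void allowAttach(Function<String, Object> approval) {
        this.approval = approval;
    }

    // Called when a client tries to attach. Returns the exported object,
    // or null when attachment is not allowed or the callback denies it.
    Object handleAttach(String clientName) {
        return approval == null ? null : approval.apply(clientName);
    }
}

class AttachDemo {
    public static void main(String[] args) {
        AttachSketchServer server = new AttachSketchServer();
        Object exported = new Object();
        // Approve one known client, deny everyone else by returning null.
        server.allowAttach(client -> client.equals("trusted") ? exported : null);
        System.out.println(server.handleAttach("trusted") == exported);
        System.out.println(server.handleAttach("stranger") == null);
    }
}
```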
RSR requests (for example, to server_handler) cause the invoke_handler() method
(part of the HandlerInterface provided by ExampleServer) to be called by NexusJava.
This method (a) uses the handler name, id, and local id to determine which of this object's
methods should be invoked, (b) unpacks the GetBuffer to get the arguments for the
method, and (c) calls that method with the arguments.
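Steps (a) to (c) amount to a dispatch on the local handler id followed by argument unpacking. The sketch below shows one plausible shape for such a method; DispatchSketchServer, its local id constants and the add handler are hypothetical, and java.nio.ByteBuffer is used in place of the NexusJava GetBuffer.

```java
import java.nio.ByteBuffer;

// Hypothetical server object whose invokeHandler() plays the role of a stub.
class DispatchSketchServer {
    static final int LOCAL_SERVER_HANDLER = 0; // hypothetical local ids
    static final int LOCAL_ADD_HANDLER = 1;

    int lastValue;
    int sum;

    void invokeHandler(String name, int id, int localId, ByteBuffer buffer) {
        switch (localId) {                       // (a) select the method
            case LOCAL_SERVER_HANDLER:
                serverHandler(buffer.getInt());  // (b) unpack, (c) call
                break;
            case LOCAL_ADD_HANDLER:
                add(buffer.getInt(), buffer.getInt());
                break;
            default:
                throw new IllegalArgumentException(
                    "unknown handler " + name + " (id " + id + ")");
        }
    }

    void serverHandler(int i) {
        lastValue = i;
    }

    void add(int a, int b) {
        sum = a + b;
    }
}

class DispatchDemo {
    public static void main(String[] args) {
        DispatchSketchServer server = new DispatchSketchServer();
        ByteBuffer buffer = ByteBuffer.allocate(4);
        buffer.putInt(10);
        buffer.flip(); // switch the buffer from writing to reading
        server.invokeHandler("server_handler", 42,
            DispatchSketchServer.LOCAL_SERVER_HANDLER, buffer);
        System.out.println(server.lastValue);
    }
}
```

Dispatching on a small integer rather than on the handler name avoids a string comparison on every incoming request, which is presumably the purpose of the local handler id.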
As mentioned above, handlers can be either threaded or non-threaded. When an RSR
arrives for a threaded handler, a new Java thread is created by NexusJava, and the
invoke_handler() method is called from within this new thread. There are no restrictions
on what this handler may do. NexusJava also supports a more efficient but restricted form
of handler invocation. If a handler is registered as non-threaded, NexusJava does not create
a new thread. Instead, it calls invoke_handler() directly from its pre-existing, internal
communications thread. This approach avoids the cost of thread creation and switching
during handler dispatch. However, the user must guarantee that a handler registered as
non-threaded will not block (wait) on any operation that may require another RSR handler
invocation to unblock (notify) the first handler.
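The difference between the two invocation modes can be seen by recording the thread on which a handler runs. In the sketch below, ThreadingSketch and deliver are hypothetical names and the calling thread plays the role of the internal communications thread; the join is there only to make the demonstration deterministic.

```java
import java.util.concurrent.atomic.AtomicReference;

class ThreadingSketch {
    // Delivers one request and returns the name of the thread that ran the handler.
    static String deliver(boolean threaded) {
        AtomicReference<String> ranOn = new AtomicReference<>();
        Runnable handler = () -> ranOn.set(Thread.currentThread().getName());

        if (threaded) {
            // Threaded handler: pay for a new thread, but the handler may block freely.
            Thread t = new Thread(handler, "rsr-handler-thread");
            t.start();
            try {
                t.join(); // demo only: wait so the result can be inspected
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        } else {
            // Non-threaded handler: runs directly on the dispatching thread,
            // so it must not wait for something only another handler can provide.
            handler.run();
        }
        return ranOn.get();
    }

    public static void main(String[] args) {
        System.out.println("threaded ran on:     " + deliver(true));
        System.out.println("non-threaded ran on: " + deliver(false));
    }
}
```

A non-threaded handler that waited for a notification from a second handler would deadlock in this model, because the second handler could only run on the same thread that is already waiting.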
Once the server receives the RSR and calls the server_handler() method, this method
will notify the main thread waiting in wait_for_client(). The server then disallows
Figure 3. Example: Server program that demonstrates handler registration, handler invocation and
buffer unpacking
6. HIGHER-LEVEL INTERFACES
As noted above, a wide variety of higher-level interaction models can be layered on top of
the low-level Nexus mechanisms. Here, we discuss techniques that can be used to implement
an RPC model. The basic idea is to use IDL-like techniques to generate automatically the
code responsible for registering handlers, marshaling arguments to remote method calls,
demarshaling arguments and dispatching method invocations. Similar techniques are used
in other systems, notably CC++[10] and CORBA[14].
Figures 2 and 3 illustrate what is involved. In the ExampleClient class, the
call_server_handler() method is essentially a stub that encapsulates the argument
marshaling and other bookkeeping required to perform a remote method invocation to the
server_handler() in the ExampleServer. Similarly, in the ExampleServer class, the
invoke_handler() method is essentially a stub that demarshals the arguments from the
buffer and calls the appropriate method (such as server_handler()) locally.
These stub methods can be generated automatically in a number of different ways. The
CORBA approach could be followed, whereby a high-level Interface Definition Language
(IDL) is used to describe the methods by which one wishes to perform remote invocations.
An IDL compiler is then used to convert this IDL specification automatically into Java
stub code. A disadvantage of this approach is that the definition and compilation of explicit
interfaces can be rather complex. Since the Java source-to-bytecode compiler is implemented in Java, and since Java classes can be loaded on the fly, an intriguing alternative is
to generate the appropriate stubs on the fly when doing handler registration.
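One way to picture stubs created at run time is the dynamic proxy facility in java.lang.reflect.Proxy. This facility postdates the Java release discussed in this article and is a different technique from compiling generated source during handler registration, so the sketch below is an analogy rather than the approach proposed here. ServerApi and StubSketch are hypothetical names, and a list of strings stands in for the marshaled messages.

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.Proxy;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical remote interface; only void methods suit one-way requests.
interface ServerApi {
    void serverHandler(int i);
}

class StubSketch {
    // Builds a client stub for the given interface at run time. Each call on
    // the stub is "marshaled" as a string and appended to the wire list.
    static <T> T stubFor(Class<T> api, List<String> wire) {
        InvocationHandler marshal = (proxy, method, args) -> {
            wire.add(method.getName() + Arrays.toString(args));
            return null; // one-way request: no result is returned
        };
        Object stub = Proxy.newProxyInstance(
            api.getClassLoader(), new Class<?>[] {api}, marshal);
        return api.cast(stub);
    }

    public static void main(String[] args) {
        List<String> wire = new ArrayList<>();
        ServerApi server = stubFor(ServerApi.class, wire);
        server.serverHandler(10); // looks like a local call
        System.out.println(wire);
    }
}
```

The caller writes an ordinary method call, and the marshaling code that Figure 2 spells out by hand is supplied generically by the invocation handler.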
7. OTHER APPROACHES
The Java community has seen several recent attempts to provide higher-level communication in Java. The two most important (and interesting) are CORBA-based products by
several companies and JavaSoft's Remote Method Invocation (RMI) package[15].
The Common Object Request Broker Architecture (CORBA) provides standard mechanisms for exporting objects for remote use, for locating remote objects, and for invoking
methods in remote objects. As mentioned above, objects export interfaces defined using an
IDL, which is compiled into language-specific stubs for use in remote method invocation.
IDL-to-Java mappings have been defined, and several companies have released IDL
compiler. These products allow Java objects to communicate with other remote objects that
have been written in Java or another language.
The JavaSoft Remote Method Invocation (RMI) specification is similar in spirit to the
CORBA approach, with three significant differences. Firstly, it uses Java-specific interface
definitions instead of a language-neutral IDL specification to produce stub code. This is
a sensible design decision for an all-Java application focus, but hinders interoperability.
Secondly, RMI does not use standard CORBA methods for object location and method
invocation. However, once a reference to a remote object has been obtained, both Java
implementations of CORBA and RMI allow methods to be invoked on that object using
essentially the same syntax as normal, local Java method invocations. Thirdly, the RMI
specification defines a Java-specific framework for marshaling parameters between locations. This Object Serialization framework is tightly coupled with the compiler front-end.
Like RMI, it works well when the entire application is to be written in Java, but is not easily
integrated with other languages, such as C and C++.
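The first difference is visible in how a remote interface is declared: in RMI the definition is itself a Java interface extending java.rmi.Remote, with no separate IDL file. The sketch below uses the standard java.rmi types; ImageEnhancer and LocalImageEnhancer are hypothetical, and the object is called locally rather than exported, which a real server would do with a class such as java.rmi.server.UnicastRemoteObject.

```java
import java.rmi.Remote;
import java.rmi.RemoteException;

// Hypothetical service. The Java interface doubles as the interface definition.
interface ImageEnhancer extends Remote {
    // Every remotely callable method must declare RemoteException.
    int enhance(int pixel) throws RemoteException;
}

// Local implementation; exporting it for remote use is omitted in this sketch.
class LocalImageEnhancer implements ImageEnhancer {
    public int enhance(int pixel) {
        return Math.min(255, pixel * 2); // toy "enhancement": brighten and clamp
    }
}

class RmiInterfaceDemo {
    public static void main(String[] args) throws RemoteException {
        ImageEnhancer enhancer = new LocalImageEnhancer();
        // Same syntax as a local call; a remote call would block until the reply arrives.
        System.out.println(enhancer.enhance(100));
    }
}
```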
CORBA and RMI mechanisms can be used to provide Nexus-like functionality, namely,
the abilities to obtain references to remote objects and to use those references to invoke
methods within those objects. The JavaSoft CORBA and RMI products are better integrated
into Java than NexusJava. However, they also have significant limitations. Neither CORBA
nor RMI supports the fully asynchronous operations provided in Nexus. CORBA does not
support the concept of a global pointer and hence cannot define a global name space. RMI
supports a remote object construct that has some similarities to the global pointer, but it is
Java-specific and does not support interfaces to other systems.
8. CONCLUSIONS
We have shown how the Nexus global pointer and remote service request mechanisms can
be incorporated into Java by defining appropriate Java classes. The resulting system makes
it possible to construct the extremely flexible communication structures enabled by Nexus,
without compromising the transportability of Java code. The techniques also support interoperability with other Nexus-based applications. Our next steps in this area will be to
experiment with the use of NexusJava for a range of ubiquitous supercomputing applications. We are also interested in developing higher-level interfaces to Nexus mechanisms by
using some of the techniques introduced above.
Our work on Nexus forms part of a larger project called Globus[13] that is developing
key infrastructure components for high-performance distributed computing. We expect
availability of NexusJava to increase significantly the range of applications for which
Globus services are useful.
For more information on the NexusJava project and the current software distribution,
see the Nexus home page http://www.mcs.anl.gov/nexus/. More information on the
Globus project can be found on the Globus home page http://www.globus.org/.
ACKNOWLEDGEMENTS
The Nexus library used to construct NexusJava has been developed jointly with Carl
Kesselman. This work was supported in part by the Mathematical, Information, and Computational Sciences Division subprogram of the Office of Computational and Technology
Research, U.S. Department of Energy, under contract W-31-109-Eng-38.
Thanks to Gail Pieper and Gregor von Laszewski for their assistance in preparing the
final manuscript.
REFERENCES
1. T. DeFanti, I. Foster, M. Papka, R. Stevens and T. Kuhfuss, Overview of the I-WAY: Wide area
visual supercomputing, Int. J. Supercomput. Appl., 10(2), 123-130 (1996).
2. I. Foster, J. Geisler, W. Nickless, W. Smith and S. Tuecke, Software infrastructure for the
I-WAY high-performance distributed computing experiment, Proc. 5th IEEE Symp. on High
Performance Distributed Computing, IEEE Computer Society Press, 1996, pp. 562-571.
3. M. Weiser, Hot topics: Ubiquitous computing, IEEE Computer, 26(10), (1993).
4. C. Lee, C. Kesselman and S. Schwab, Near-realtime satellite image processing: Metacomputing
in CC++, Comp. Graph. Appl., 16(4), 79-84 (1996).
5. D. Diachin, L. Freitag, D. Heath, J. Herzog, W. Michels and P. Plassmann,
Remote engineering tools for the design of pollution control systems for commercial boilers,
Int. J. Supercomput. Appl., 10(2), (1996).