Distributed Systems Research
distributed system software. This software enables computers to coordinate their activities and to share the resources of the system: hardware, software, and data. Users of a distributed system should perceive a single, integrated computing facility even though it may be implemented by many computers in different locations. This is in contrast to a network, where the user is aware that there are several machines, and where their locations, storage replication, load balancing, and functionality are not transparent. Benefits of distributed systems include bridging geographic distances, improving performance and availability, maintaining autonomy, reducing cost, and allowing for interaction.

The object-oriented model for a distributed system is based on the model supported by object-oriented programming languages. Distributed object systems generally provide remote method invocation (RMI) in an object-oriented programming language, together with operating system support for object sharing and persistence. Remote procedure calls, which are used in client-server communication, are replaced by remote method invocation in distributed object systems. The state of an object consists of the values of its instance variables. In the object-oriented paradigm, the state of a program is partitioned into separate parts, each of which is associated with an object. Since object-based programs are logically partitioned, the physical distribution of objects into different processes or computers in a distributed system is a natural extension.

The Object Management Group's Common Object Request Broker Architecture (CORBA) is a widely used standard for distributed object systems. Other object management systems include the Open Software Foundation's Distributed Computing Environment (DCE) and Microsoft's Distributed Component Object Model (DCOM). CORBA specifies a system that provides interoperability among objects in a heterogeneous, distributed environment in a way that is transparent to the programmer.
Its design is based on the Object Management Group's object model. This model defines common object semantics for specifying the externally visible characteristics of objects in a standard and implementation-independent way. In this model, clients request services from objects (which will also be called servers) through a well-defined interface. This interface is specified in the Object Management Group Interface Definition Language (IDL). The request is an event, and it carries information including an operation, the object reference of the service provider, and actual parameters (if any). The object reference is a name that identifies an object reliably.

The central component of CORBA is the object request broker (ORB). It encompasses the entire communication infrastructure necessary to identify and locate objects, handle connection management, and deliver data. In general, the object request broker is not required to be a single component; it is simply defined by its interfaces. The core is the most crucial part of the object request broker; it is responsible for the communication of requests. The basic functionality provided by the object request broker consists of passing requests from clients to the object implementations on which they are invoked.

To make a request, the client can communicate with the ORB core through the IDL stub or through the dynamic invocation interface (DII). The stub represents the mapping between the implementation language of the client and the ORB core. Thus the client can be written in any language, as long as the implementation of the object request broker supports this mapping. The ORB core then transfers the request to the object implementation, which receives the request as an up-call through either an IDL skeleton (which represents the object interface at the server side and works with the client stub) or a dynamic skeleton (a skeleton with multiple interfaces).

Many different ORB products are currently available. This diversity is healthy, since it allows vendors to gear their products toward the specific needs of their operational environment. It also creates the need for different object request brokers to interoperate. Furthermore, there are distributed and client-server systems that are not CORBA-compliant, and there is a growing need to provide interoperability between those systems and CORBA.
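The request path described above (client, stub, ORB core, skeleton, object implementation) can be illustrated with a minimal in-process sketch. This is not the CORBA API: ToyORB, Skeleton, AccountStub, Account, and the reference string "account:42" are hypothetical names chosen only for illustration, and no real marshalling or networking takes place.

```python
from dataclasses import dataclass
from typing import Any, Dict, Tuple


@dataclass(frozen=True)
class Request:
    """A request carries an object reference, an operation, and actual parameters."""
    object_ref: str
    operation: str
    args: Tuple[Any, ...] = ()


class Skeleton:
    """Server side: receives the request as an up-call and invokes the implementation."""

    def __init__(self, implementation: Any) -> None:
        self._impl = implementation

    def upcall(self, request: Request) -> Any:
        method = getattr(self._impl, request.operation)
        return method(*request.args)


class ToyORB:
    """Hypothetical broker: locates the target object and delivers the request."""

    def __init__(self) -> None:
        self._skeletons: Dict[str, Skeleton] = {}

    def register(self, object_ref: str, implementation: Any) -> None:
        self._skeletons[object_ref] = Skeleton(implementation)

    def invoke(self, request: Request) -> Any:
        skeleton = self._skeletons.get(request.object_ref)
        if skeleton is None:
            raise LookupError(f"unknown object reference: {request.object_ref}")
        return skeleton.upcall(request)


class Account:
    """Object implementation (the server object)."""

    def __init__(self) -> None:
        self.balance = 0

    def deposit(self, amount: int) -> int:
        self.balance += amount
        return self.balance


class AccountStub:
    """Client side: looks like a local object but turns each call into a Request."""

    def __init__(self, orb: ToyORB, object_ref: str) -> None:
        self._orb = orb
        self._ref = object_ref

    def deposit(self, amount: int) -> int:
        return self._orb.invoke(Request(self._ref, "deposit", (amount,)))


orb = ToyORB()
orb.register("account:42", Account())

# Static invocation through the stub.
stub = AccountStub(orb, "account:42")
via_stub = stub.deposit(100)

# Dynamic invocation: the client builds the request itself, without a stub.
via_dynamic = orb.invoke(Request("account:42", "deposit", (50,)))
print(via_stub, via_dynamic)
```

The second call is loosely analogous to using the dynamic invocation interface: the operation name and parameters are supplied at run time instead of being fixed in a generated stub.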
To answer these needs, the Object Management Group has formulated the ORB interoperability architecture. The interoperability approaches can be divided into mediated and immediate bridging. With mediated bridging, interacting elements of one domain are transformed at the boundary of each domain between the internal form specific to this domain and some other form mutually agreed on by the domains. This common form could be either standard (specified by the Object Management Group, for example, the Internet Inter-ORB Protocol or IIOP) or a private agreement between the two parties. With immediate bridging, elements of interaction are transformed directly between the internal form of one domain and that of the other. Immediate bridging has the potential to be much faster, but it is the less general approach; it should therefore be possible to use both. Furthermore, if the mediation is internal to one execution environment (for example, TCP/IP), it is known as a full bridge; otherwise, if the execution environment of one object request broker is different from the common protocol, each object request broker is said to be a half bridge.

Communication in Distributed Systems
- Message-Oriented Communication
  - Message-Oriented Transient Communication
  - Message-Oriented Persistent Communication
- Stream-Oriented Communication
- RPC/RMI Protocols
Remote Procedure Call

Steps of a Remote Procedure Call
1. The client procedure calls the client stub in the normal way.
2. The client stub builds a message and calls the local OS.
3. The client's OS sends the message to the remote OS.
4. The remote OS gives the message to the server stub.
5. The server stub unpacks the parameters and calls the server.
6. The server does the work and returns the result to the stub.
7. The server stub packs the result in a message and calls its local OS.
8. The server's OS sends the message to the client's OS.
9. The client's OS gives the message to the client stub.
10. The client stub unpacks the result and returns it to the client.
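The ten steps above can be traced in a small single-process sketch. It assumes JSON as the message format and uses an ordinary function call, transport, as a stand-in for both operating systems and the network; the names PROCEDURES, server_stub, transport, and client_stub_add are hypothetical and do not belong to any real RPC library.

```python
import json
from typing import Any, Callable, Dict


def add(a: int, b: int) -> int:
    """The server procedure that does the actual work (step 6)."""
    return a + b


# Procedures the server is willing to execute, looked up by name.
PROCEDURES: Dict[str, Callable[..., Any]] = {"add": add}


def server_stub(message: bytes) -> bytes:
    # Step 5: unpack the parameters and call the server procedure.
    call = json.loads(message.decode("utf-8"))
    result = PROCEDURES[call["proc"]](*call["args"])
    # Step 7: pack the result into a reply message.
    return json.dumps({"result": result}).encode("utf-8")


def transport(message: bytes) -> bytes:
    # Steps 3-4 and 8-9: stands in for the two operating systems and the
    # network that carry the request one way and the reply back.
    return server_stub(message)


def client_stub_add(a: int, b: int) -> int:
    # Step 2: build the request message and hand it to the "OS".
    message = json.dumps({"proc": "add", "args": [a, b]}).encode("utf-8")
    reply = transport(message)
    # Step 10: unpack the result and return it to the caller.
    return json.loads(reply.decode("utf-8"))["result"]


# Step 1: the client calls the stub exactly like a local procedure.
total = client_stub_add(7, -8)
print(total)  # prints -1
```

From the caller's point of view the stub is indistinguishable from a local function; all packing and unpacking is hidden inside the two stubs.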
Five different classes of failures in RPC systems:

Client Cannot Locate the Server
- The server might be down, or a new version of the server may be running with an old client.
- Returning -1 for an error is not good enough, because -1 might be a legitimate return value (e.g. adding 7 to -8).
- Raise an exception instead (for example, write your own SIGNOSERVER).

Lost Request Messages
- Easiest to deal with: have the kernel start a timer when sending the request, and resend the request if the timer expires.

Lost Reply Messages
- Use a timer again, but this time the client cannot tell whether the reply was lost or the server is just slow.
- Idempotent operations can safely be repeated: reading the first 1024 bytes of a file is idempotent; transferring $1 million from a bank account is not.
- Solution 1: construct all requests in an idempotent way.
- Solution 2: have the client's kernel assign each request a sequence number, so the server can tell a retransmission from a new request.

Server Crashes
- In the 2nd case (the server crashes after executing the request but before sending the reply), the system has to report the failure back to the client.
- In the 3rd case (the server crashes before executing the request), the client can just retransmit the request.
- At-least-once semantics: keep retrying until a reply arrives.
- At-most-once semantics: give up immediately and report the failure.
- Exactly-once semantics: the ideal, but in general it cannot be guaranteed.

Client Crashes
- A computation that keeps running on the server after its client has crashed is unwanted and is called an orphan.
- Solution 1: extermination. Keep a log of every request, so that orphans can be found and killed after the client reboots.
- Solution 2: reincarnation. Divide time into sequentially numbered epochs. When a client reboots, it broadcasts a message to all machines declaring a new epoch, and all remote computations are killed.
- Solution 3: gentle reincarnation. Same as solution 2, but remote computations are killed only if they cannot find their owner.
- Solution 4: expiration. Each RPC is given a standard amount of time T to do its job; if it cannot finish within T, it has to ask for another quantum. After a crash, the client waits T units of time before rebooting, so that all orphans are sure to be gone.
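The timer-and-retransmit handling of lost messages, combined with sequence numbers (solution 2 under Lost Reply Messages), can be sketched as follows. This is a minimal single-process illustration under stated assumptions: Server, LossyChannel, and Client are hypothetical classes, a dropped reply is modelled by returning None in place of a real timer expiry, and the reply cache lives in memory, so the sketch does not cover server crashes.

```python
from typing import Dict, Optional, Tuple


class Server:
    """Executes each (client_id, seq) pair at most once and caches its reply."""

    def __init__(self) -> None:
        self.balance = 0
        self.executions = 0
        self._replies: Dict[Tuple[str, int], int] = {}

    def handle(self, client_id: str, seq: int, amount: int) -> int:
        key = (client_id, seq)
        if key in self._replies:
            # Retransmission of a request already carried out:
            # resend the cached reply, do not execute again.
            return self._replies[key]
        self.balance += amount  # non-idempotent operation
        self.executions += 1
        self._replies[key] = self.balance
        return self.balance


class LossyChannel:
    """Hypothetical channel that delivers requests but loses the first few replies."""

    def __init__(self, server: Server, drop_replies: int) -> None:
        self._server = server
        self._drops_left = drop_replies

    def call(self, client_id: str, seq: int, amount: int) -> Optional[int]:
        reply = self._server.handle(client_id, seq, amount)
        if self._drops_left > 0:
            self._drops_left -= 1
            return None  # reply lost: the client's timer would expire
        return reply


class Client:
    """Assigns a sequence number per request and retransmits on timeout."""

    def __init__(self, client_id: str, channel: LossyChannel, max_attempts: int = 5) -> None:
        self._id = client_id
        self._channel = channel
        self._max_attempts = max_attempts
        self._next_seq = 0

    def deposit(self, amount: int) -> int:
        seq = self._next_seq  # the same number is reused for every retransmission
        self._next_seq += 1
        for _ in range(self._max_attempts):
            reply = self._channel.call(self._id, seq, amount)
            if reply is not None:
                return reply
        raise TimeoutError("no reply after retransmissions")


server = Server()
client = Client("c1", LossyChannel(server, drop_replies=2))
balance = client.deposit(100)
print(balance, server.executions)  # prints 100 1
```

Although the request reaches the server three times, the deposit is applied only once, because the server recognizes the repeated sequence number and answers from its cache.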