Distributed and High-Performance Computing

Paul Coddington
Distributed & High Performance Computing Research Group
Department of Computer Science, University of Adelaide
Room 1052
http://dhpc.adelaide.edu.au
[email protected]
July - October 2000
DISTRIBUTED COMPUTING I
Historical Perspective
Work in distributed systems started with workstations:

- Cheap microprocessors and workstations/PCs, modern operating systems
  (Unix and NT), and the rise of LANs (e.g. Ethernet) led to the
  replacement of the mainframe model by a distributed network of
  workstations.
- Ideas from Xerox Palo Alto in the early 1980s: idle cycles on
  workstations could be exploited via a shared file system, so that any
  process can run on any available system.
- Significant complications for the file system: concurrency control
  problems.
- XDFS was first implemented on Xerox D Series workstations.
  - Unique network-wide file identifiers (integers) allow retrieval of
    files from anywhere on the network.
  - Mapping from a human-readable name to a FID is done by a directory
    server (itself a distributed application).
  - The system was transparent, but slow and fault intolerant.
- Sun took the core functionality of XDFS into NFS (1987).
- NFS evolved into a de facto standard: fairly robust, efficient and
  transparent.
- A very wide area transparent file store is still a research area
  (WebFS, Globus, DWorFS, etc).
NFS Layers
For the NFS implementation, a new representation: v-nodes.

- v-node triple: (computer-ID, FS number, i-node)
- Also allows support for foreign (i.e. non-Unix) file systems.
- A client/server model is used to communicate across the network.
- Allows NFS to look like a normal Unix file system to applications.
- NFS is stateless - no server retains information about clients.
  - If a client crashes, there is no effect on the server.
  - If the server crashes, the client blocks until the server returns.
  - Client computations are delayed but not damaged (blocking
    communications).
  - Statelessness costs some performance.
- NFS is optimized for transferring lots of small blocks, so it is not
  optimal for bulk data transfer (e.g. in HPC applications); some
  research work is addressing this problem.
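As a rough illustration of the triple above, a v-node identifier might be
represented as follows. This is a sketch only; the field names are
invented for illustration and real NFS implementations differ:

  /* Illustrative v-node identifier: (computer-ID, FS number, i-node).
   * Field names are hypothetical. */
  #include <stdint.h>

  struct vnode_id {
      uint32_t computer_id;  /* which machine holds the file system */
      uint32_t fs_number;    /* which file system on that machine   */
      uint32_t inode;        /* i-node within that file system      */
  };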
NFS Problems
- A consequence of statelessness is the lack of file locking.
  - Normal Unix allows files to be locked for reading/writing.
- In a normal Unix kernel, file blocks are cached in the kernel.
- NFS speeds up operations by caching at both the client and server ends.
- This creates problems - clients may have an inconsistent view of data.
  - A client reads from its own cache.
  - Some other client may have modified that part of the file already,
    i.e. the file has been changed on the server.
  - The NFS approach is to request a new copy if the cached data is more
    than some number of seconds old (3?). This is costly and not very
    effective.
- No locking means writes must be synchronous, so write operations
  complete only when the server has written the data.
  - Writes are therefore much slower than reads (which can be from
    cache).
- NFS addresses the problem as well as possible given the constraints -
  so we still use it.
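A minimal sketch of the timeout-based revalidation idea described above,
assuming a hypothetical client-side block cache (the names and the
3-second figure are illustrative only):

  /* Sketch of time-based cache revalidation; all names hypothetical. */
  #include <time.h>

  #define CACHE_TIMEOUT 3  /* seconds a cached block is trusted */

  struct cached_block {
      time_t fetched_at;   /* when this copy was read from the server */
      char   data[8192];
  };

  /* Return 1 if the cached copy must be re-fetched from the server. */
  int needs_revalidation(const struct cached_block *b)
  {
      return time(NULL) - b->fetched_at > CACHE_TIMEOUT;
  }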
Architecture Models
Three basic architectural models for distributed systems:

- workstations/servers model;
- processor pool (thin client) model;
- integrated model.
Workstation Model
- Each user has a workstation.
- Application programs run on the workstation.
- Specialised servers perform designated services (e.g. file, directory,
  authentication, news, printing, gateway, mail, specialist processing).
- Workstations are integrated by sharing a common set of resources and a
  common interface.
- Usually the user ID is unique across the whole network of workstations,
  and any user may use any workstation.
- A system-wide filestore is mandatory.
  - Some workstations may in addition have private filestores - these
    must be exported to allow transparent access from other machines.
- A user can also run application programs remotely on other
  workstations.
- Cluster Management Systems (CMS) allow a user to submit jobs
  transparently to the NOW, rather than having to manually choose a
  specific machine to run on.
  - The CMS handles resource allocation, scheduling and queueing.
Transparency
We would like access to services to be transparent, i.e. an application
in a distributed system just has to request a service, but does not need
to know where it is performed. Advantages are:

- Ease of programming: the application does not need to worry about
  specifying a particular server to perform the service.
- Redundancy and fault tolerance: multiple servers can provide the same
  service; if one is down, the application uses another.
- Efficiency: if multiple servers offer the same service, the application
  can use one which is less loaded and/or has a faster network
  connection.

Some issues in providing transparency:

- Discovery: how to find remote services (or objects)?
- Access: how to access remote objects (and distinguish between accessing
  local and remote objects)?
- Failure: how to maintain the service when some components fail?
- Replication: maintain multiple copies of objects, but must treat them
  as a single object (e.g. for updating).
Location of Services
- Location of services and service (or resource) discovery is a major
  problem for distributed systems.
- In TCP/IP there is the notion of the well-known service: this is
  effectively hardwired (using port numbers), as illustrated below.
- More generally, how do you find the service you want?
  - Require clients to request a given service, either from the OS or a
    broker.
  - Locate the service in a service registry (directory), which provides
    a mapping from service names to the machines and programs that
    provide the service, and the program interface.
  - Access the registry either directly (via the OS) or by using a broker
    or a trader.
  - The DC system binds a particular server instance to the client.
- Directory or registry services are hard to build properly for
  distributed systems: potential problems with scalability, performance,
  uniqueness of names, etc.
- A trader should ideally be able to allocate the best instance of the
  service (least loaded, highest bandwidth, etc.) for the client, but
  this is usually not done by default - the programmer has to sort it
  out.
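For the TCP/IP well-known service case, the "registry" is simply the
/etc/services database, queried with the standard getservbyname() call:

  /* Look up a well-known TCP/IP service by name. */
  #include <stdio.h>
  #include <netdb.h>
  #include <arpa/inet.h>

  int main(void)
  {
      struct servent *s = getservbyname("ftp", "tcp");
      if (s == NULL) {
          fprintf(stderr, "service not found\n");
          return 1;
      }
      /* s_port is stored in network byte order */
      printf("ftp/tcp is port %d\n", ntohs(s->s_port));
      return 0;
  }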
Consistency
- The behaviour of a distributed system must be predictable.
- Hope that the response is (almost) as fast and robust as a standalone
  desktop system.
- Failure modes: the rest of the system continues to operate when one
  part fails (an independent failure mode).
  - Good in that users can carry on using the remaining system and do
    work.
  - Bad in that some services or part of a database might now be
    inaccessible (unless there is redundancy and failover).
- User interface consistency: want the same user interface present.
- Response times need to be well managed - careful choices on responses
  and timeouts.
  - Want some idea of what the delay is if it is more than a second, for
    example.
  - Mouse movement must be smooth.
  - Screen update must be fast (transferring bitmaps for the whole
    screen).
Latency
- In a tightly coupled distributed memory parallel computer,
  communications latency is fairly constant.
- In a loosely coupled network (e.g. a NOW), latency is usually higher
  and has greater variance.
- Intrinsic latency comes from hardware, speed-of-light constraints, and
  message passing software overhead.
- Standard transport protocols for unreliable networks (e.g. TCP/IP) have
  high latency (heavyweight).
- Latency is much more variable due to non-deterministic delivery times.
- Latency in distributed computing is an even bigger overhead than for
  parallel computing, hence services are usually coarse-grained.

Measuring latency:

- Usually measured by the return trip time (RTT), as in the sketch below.
- It is not safe to assume latency is fixed and the same for all packets:
  packets may take different routes through the network, especially over
  a WAN.
- Also, network traffic is bursty, and this leads to variance in the
  return trip time.
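A minimal, self-contained sketch of an RTT measurement: fork a trivial
UDP echo process on the loopback interface and timestamp one
request/reply round trip. The port number is arbitrary, and a real
measurement would average many trips because of the variance noted above:

  #include <stdio.h>
  #include <unistd.h>
  #include <sys/time.h>
  #include <sys/types.h>
  #include <sys/socket.h>
  #include <netinet/in.h>

  int main(void)
  {
      struct sockaddr_in addr = { 0 };
      addr.sin_family = AF_INET;
      addr.sin_port = htons(9999);      /* arbitrary port for the test */
      addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);

      if (fork() == 0) {                /* child: one-shot echo server */
          int s = socket(AF_INET, SOCK_DGRAM, 0);
          struct sockaddr_in from;
          socklen_t len = sizeof from;
          char buf[64];
          ssize_t n;
          bind(s, (struct sockaddr *)&addr, sizeof addr);
          n = recvfrom(s, buf, sizeof buf, 0,
                       (struct sockaddr *)&from, &len);
          sendto(s, buf, n, 0, (struct sockaddr *)&from, len);
          return 0;
      }

      sleep(1);                         /* give the child time to bind */
      int s = socket(AF_INET, SOCK_DGRAM, 0);
      char buf[64] = "ping";
      struct timeval t0, t1;

      gettimeofday(&t0, NULL);          /* timestamp before the send   */
      sendto(s, buf, 5, 0, (struct sockaddr *)&addr, sizeof addr);
      recvfrom(s, buf, sizeof buf, 0, NULL, NULL); /* block for echo   */
      gettimeofday(&t1, NULL);          /* timestamp after the reply   */

      printf("RTT = %.3f ms\n",
             (t1.tv_sec - t0.tv_sec) * 1000.0 +
             (t1.tv_usec - t0.tv_usec) / 1000.0);
      return 0;
  }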
Interprocess Communication
- Distributed processes or tasks need to communicate.
- For distributed computing we usually do not have shared memory, so we
  need to use a message passing method:
  - process A sends a message to process B;
  - process B receives it.
- Send/receive may be synchronous (A blocks until B receives the message)
  or asynchronous (some buffering mechanism allows A to proceed as soon
  as it has sent the data).

  Process A:                    Process B:
    Send( message, B );           Receive( message, A );
    Receive( reply, B );          Send( reply, A );
    ...                           ...

- These simple ideas - send/receive, plus some startup interrogation to
  find out process identities - form the basis for the distributed and
  parallel computation mechanism.
- Pairing of receive and send together into a single unit forms a
  transaction.
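A minimal runnable sketch of this pattern using two Unix pipes, with a
blocking read() standing in for a blocking Receive:

  #include <stdio.h>
  #include <unistd.h>

  int main(void)
  {
      int a_to_b[2], b_to_a[2];
      char buf[32];

      pipe(a_to_b);
      pipe(b_to_a);

      if (fork() == 0) {                    /* process B */
          read(a_to_b[0], buf, sizeof buf); /* Receive( message, A ) */
          printf("B received: %s\n", buf);
          write(b_to_a[1], "reply", 6);     /* Send( reply, A )      */
          return 0;
      }

      /* process A */
      write(a_to_b[1], "message", 8);       /* Send( message, B )    */
      read(b_to_a[0], buf, sizeof buf);     /* Receive( reply, B )   */
      printf("A received: %s\n", buf);
      return 0;
  }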
RPC Programming
- See the O'Reilly book "Power Programming with RPC" by Bloomer and the
  exercises on the RPC programming lab Web page.
- Also man rpc or man rpcgen on Unix.
- Consider the client and server sides for RPCs:
  - The client needs a specification of the interface the server is
    exporting.
  - This is normally written in an interface specification language
    (unless RPC is part of the programming language).
  - The specification language is able to describe simple data types and
    structures.
  - Need a canonical representation - specify exactly and unambiguously.
- Pass the description of the server interface to an interface compiler
  or stub generator (rpcgen).
- Stub generators synthesise language-callable stubs which invoke code to
  perform the RPC - hence transparency.
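As a sketch of what such an interface specification might look like for
rpcgen, here is a hypothetical date service; the names and the program
number are invented for illustration:

  /* date.x -- hypothetical interface specification for rpcgen. */
  program DATE_PROG {
      version DATE_VERS {
          string GET_DATE(void) = 1;   /* procedure number 1 */
      } = 1;                           /* version number 1   */
  } = 0x20000099;                      /* program number     */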
For example, the client code:

  s = bind_to_service( the_service );
  ... check for errors
  rpc_call( parameters, s );
  ... check for errors

- The first call locates a server to provide the appropriate service.
- Implementing the binding is not trivial.
- The stub generator may make the server instance transparent.
- Typical RPC systems have the procedures return a status result - this
  can be used to indicate failure, timeouts, ...

For example, the server code:

  definition of procedure s;
  begin server
    register( the_service, s );
    ... check for errors
  end server;

- The stub generator registers the particular service.
- Need a name for each procedure you register.
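As a concrete (hypothetical) counterpart to the pseudocode above, an ONC
RPC client for the date.x interface sketched earlier might look like
this: clnt_create() plays the role of bind_to_service(), and get_date_1()
is the stub rpcgen would generate (its name follows rpcgen's convention
of appending the version number):

  #include <stdio.h>
  #include <stdlib.h>
  #include <rpc/rpc.h>
  #include "date.h"    /* generated by: rpcgen date.x */

  int main(int argc, char *argv[])
  {
      CLIENT *clnt;
      char **result;

      if (argc != 2) {
          fprintf(stderr, "usage: %s server-host\n", argv[0]);
          exit(1);
      }

      /* bind_to_service: locate the server, create a client handle */
      clnt = clnt_create(argv[1], DATE_PROG, DATE_VERS, "udp");
      if (clnt == NULL) {
          clnt_pcreateerror(argv[1]);      /* ... check for errors */
          exit(1);
      }

      /* rpc_call: invoke the remote procedure through the stub */
      result = get_date_1(NULL, clnt);
      if (result == NULL)
          clnt_perror(clnt, "get_date_1"); /* ... check for errors */
      else
          printf("server date: %s\n", *result);

      clnt_destroy(clnt);
      return 0;
  }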
- Names often just become numbers.
- The remote procedure is simply coded to appropriate conventions.
- The stub generator wraps a service routine in code to handle the RPC
  infrastructure.
- Services are coded in the same way as any other procedure.
- Namespace management and the registry is a major problem.
- RPC is the underlying technology for many distributed computing
  applications, including NFS.
Consider where an RPC can fail: (1) the request message is lost; (2) the
server crashes before executing the call; (3) the server crashes after
executing the call but before replying; (4) the reply message is lost.

- In case 1 the call has not been made.
- In case 2 the call has not been made.
- In cases 3 and 4 the call may have been made... this uncertainty is the
  problem.
- At-least-once semantics for RPCs: the client knows >= 1 successful
  calls have been made.
- At-most-once semantics: <= 1 successful call has been made.
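One common way to implement at-most-once semantics is for the server to
remember request identifiers and replay a cached reply for a duplicate
rather than re-execute the call. A simplified sketch, with all names and
types invented for illustration:

  #include <string.h>

  #define MAX_CLIENTS 64

  /* Stand-in for the body of the real remote procedure. */
  static const char *do_call(void)
  {
      return "result";
  }

  struct history {
      long client_id;
      long last_request_id;  /* last request ID already executed */
      char cached_reply[256];
  };

  static struct history table[MAX_CLIENTS];

  /* Execute a request at most once; duplicates get the cached reply. */
  const char *handle_request(long client, long request_id)
  {
      struct history *h = &table[client % MAX_CLIENTS];

      /* Duplicate of the last executed request: replay the cached
         reply instead of executing the procedure a second time. */
      if (h->client_id == client && h->last_request_id == request_id)
          return h->cached_reply;

      h->client_id = client;
      h->last_request_id = request_id;
      strcpy(h->cached_reply, do_call());
      return h->cached_reply;
  }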