Distributed Systems (Fall 2013) : Lec 4: Remote Procedure Calls (RPC)
YFS Lab Series Background
YFS
• Instructional distributed file system developed at MIT, modeled
after a research distributed file system called Frangipani
– Analogous to xv6 for OS courses
– When we discuss YFS, we really refer to Frangipani (or a
simplified version thereof)
– Thekkath, Chandramohan A., Timothy Mann, and Edward K. Lee.
"Frangipani: A scalable distributed file system." ACM SIGOPS
Operating Systems Review. Vol. 31. No. 5. ACM, 1997.
YFS Design Goals
• Aggregate many disks from
many servers
• Incrementally scalable
• Tolerates and recovers from
node, network, disk failures
Design
Lock Service
• Resolve concurrent read/write issues
– If one client is writing to a file while another
client is reading it, inconsistency appears.
• Correctness:
– Each lock is granted to at most one client at any time
• Additional requirement:
– acquire() at client does not return until lock is granted
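The two requirements above can be sketched on the server side roughly as follows. This is only an illustration, not the lab's interface: the `lock_table` type, the `lt_*` function names, the fixed `NLOCKS` table size, and the integer client ids are all assumptions made for the sketch.

```c
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

#define NLOCKS 16  /* assumption: a small fixed set of lock ids */

/* Hypothetical server-side lock table: holder[i] is the id of the
 * client holding lock i, or -1 if lock i is free. */
typedef struct {
    pthread_mutex_t mu;
    pthread_cond_t  freed;
    int holder[NLOCKS];
} lock_table;

void lt_init(lock_table *t) {
    pthread_mutex_init(&t->mu, NULL);
    pthread_cond_init(&t->freed, NULL);
    for (int i = 0; i < NLOCKS; i++) t->holder[i] = -1;
}

/* Non-blocking attempt: grants the lock only if nobody holds it. */
bool lt_try_acquire(lock_table *t, int lock_id, int client) {
    pthread_mutex_lock(&t->mu);
    bool granted = (t->holder[lock_id] == -1);
    if (granted) t->holder[lock_id] = client;
    pthread_mutex_unlock(&t->mu);
    return granted;
}

/* Blocking acquire: does not return until the lock is granted. */
void lt_acquire(lock_table *t, int lock_id, int client) {
    pthread_mutex_lock(&t->mu);
    while (t->holder[lock_id] != -1)
        pthread_cond_wait(&t->freed, &t->mu);
    t->holder[lock_id] = client;
    pthread_mutex_unlock(&t->mu);
}

/* Release succeeds only for the current holder and wakes waiters. */
bool lt_release(lock_table *t, int lock_id, int client) {
    pthread_mutex_lock(&t->mu);
    bool ok = (t->holder[lock_id] == client);
    if (ok) {
        t->holder[lock_id] = -1;
        pthread_cond_broadcast(&t->freed);
    }
    pthread_mutex_unlock(&t->mu);
    return ok;
}
```

Because every read and write of `holder` happens under `mu`, a lock id can never have two holders at once; the `while` loop in `lt_acquire` is what keeps a caller blocked until the grant.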
Lab 2 Steps
• Step 1: Check out the skeleton code from ds-git
• Step 2: Implement server lock and client lock
– Test it using locker_tester
Last Time (Reminder/Quiz)
• Processes: A resource container for execution on a
single machine
• Threads: One “thread” of execution through code.
Can have multiple threads per process.
• Why processes? Why threads? Why either?
• Local communication
– Inter-process communication
– Thread synchronization
Today: Distributed Communication
• Socket communication
• Remote Procedure Calls (RPCs)
• RPC challenges
Common Communication Pattern
[Figure: communication timeline; labels: "working", "Done/Result"]
Communication Mechanisms
• Many possibilities and protocols for communicating in
a distributed system
– Sockets (mostly in HW1)
– RPC (today)
– Distributed shared memory (possibly later classes)
– Map/Reduce, Dryad (later classes)
– MPI (on your own)
Socket Communication
• You build your own protocol on top of a transport
protocol (e.g., TCP or UDP)
Socket Communication
• TCP (Transmission Control Protocol)
– Protocol built upon the IP networking protocol, which
supports sequenced, reliable, two-way transmission over a
connection (or session, stream) between two sockets
• Use:
– TCP when you need reliability and the cost of setting up a
connection is not a major concern (e.g., file transmission, a lock service)
– UDP when it’s OK to lose, re-order, or duplicate messages, but you
want low latency (e.g., online games, messaging)
Socket API Overview
Both sides first create a socket: int socket(domain, type, protocol)

Client                                  Server
socket                                  socket
                                        bind:    bind(server_sock, &server_address, server_len)
                                        listen:  listen(server_sock, backlog)
connect: connect(client_sock,           accept:  accept(server_sock, &client_addr, &client_len)
         &server_addr, server_len)
         (connection request -->)
Client/server session:
write -->                               read
read                                <-- write
close (EOF) -->                         read
                                        close
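The call sequence above can be exercised end to end in one process over the loopback interface. The sketch below is illustrative only: the helper names (`server_listen`, `client_connect`, `echo_once`) are made up for this example, the server simply echoes the request, and a single `read` per message is assumed to be enough because the message is tiny. Real code must loop on short reads and writes.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Server side: socket -> bind -> listen on 127.0.0.1, OS-chosen port.
 * Stores the chosen port in *port_out; returns the socket or -1. */
int server_listen(unsigned short *port_out) {
    int s = socket(AF_INET, SOCK_STREAM, 0);
    if (s < 0) return -1;
    struct sockaddr_in addr;
    memset(&addr, 0, sizeof addr);
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    addr.sin_port = 0;                       /* let the OS pick a port */
    socklen_t len = sizeof addr;
    if (bind(s, (struct sockaddr *)&addr, len) < 0 ||
        listen(s, 1) < 0 ||
        getsockname(s, (struct sockaddr *)&addr, &len) < 0) {
        close(s);
        return -1;
    }
    *port_out = ntohs(addr.sin_port);
    return s;
}

/* Client side: socket -> connect. Returns the socket or -1. */
int client_connect(unsigned short port) {
    int c = socket(AF_INET, SOCK_STREAM, 0);
    if (c < 0) return -1;
    struct sockaddr_in addr;
    memset(&addr, 0, sizeof addr);
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    addr.sin_port = htons(port);
    if (connect(c, (struct sockaddr *)&addr, sizeof addr) < 0) {
        close(c);
        return -1;
    }
    return c;
}

/* One request/response session: the client writes msg, the server
 * reads it and writes it back, the client reads the reply.
 * Returns the reply length, or -1 on any error. */
int echo_once(const char *msg, char *reply, size_t cap) {
    unsigned short port;
    int srv = server_listen(&port);
    if (srv < 0) return -1;
    int cli = client_connect(port);  /* completes via the listen backlog */
    int conn = cli < 0 ? -1 : accept(srv, NULL, NULL);
    int n = -1;
    if (conn >= 0) {
        char buf[256];
        size_t len = strlen(msg);
        if (len < sizeof buf && len < cap &&
            write(cli, msg, len) == (ssize_t)len) {
            ssize_t got = read(conn, buf, sizeof buf);      /* server reads */
            if (got > 0 && write(conn, buf, (size_t)got) == got) {
                ssize_t r = read(cli, reply, cap - 1);      /* client reads */
                if (r >= 0) { reply[r] = '\0'; n = (int)r; }
            }
        }
        close(conn);
    }
    if (cli >= 0) close(cli);
    close(srv);
    return n;
}
```

Calling `echo_once("hello", reply, sizeof reply)` walks through every box in the diagram once; in a real system the client and server halves would of course live in separate processes on separate machines.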
Complexities of Using the Socket API
• Lots of boiler-plate when using a raw socket API
• Lots of bugs/inefficiencies if you’re not careful
– E.g.: retransmissions, multi-threading, …
• Plus, you have to invent the data transmission protocol
– Can be complex
– Hard to maintain
struct foomsg {
u_int32_t len;
}
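To give a sense of what "inventing the protocol" involves, here is a minimal sketch of one common choice: a 4-byte length prefix followed by the payload, in the spirit of the `len` field above. The `frame_encode` and `frame_decode` names and the exact wire layout are assumptions for this example, not something the slides prescribe.

```c
#include <arpa/inet.h>
#include <stdint.h>
#include <string.h>

/* Encode payload as [4-byte big-endian length][payload bytes].
 * Returns the number of bytes written, or 0 if out is too small. */
size_t frame_encode(const void *payload, uint32_t len,
                    uint8_t *out, size_t cap) {
    if (cap < 4 || cap - 4 < len) return 0;
    uint32_t be = htonl(len);          /* length in network byte order */
    memcpy(out, &be, 4);
    memcpy(out + 4, payload, len);
    return 4 + (size_t)len;
}

/* Decode one frame from in[0..n). On success sets *payload and *len
 * and returns the number of bytes consumed. Returns 0 if the frame is
 * still incomplete, meaning the caller should read more bytes first. */
size_t frame_decode(const uint8_t *in, size_t n,
                    const uint8_t **payload, uint32_t *len) {
    if (n < 4) return 0;
    uint32_t be;
    memcpy(&be, in, 4);
    uint32_t l = ntohl(be);
    if (n - 4 < l) return 0;
    *payload = in + 4;
    *len = l;
    return 4 + (size_t)l;
}
```

Even this tiny format forces decisions about byte order, partial reads, and buffer sizes, and every new message type adds more; this is the maintenance burden the slide refers to.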
Client:
{ ...
resp = foo("hello");
}
Server:
int foo(char* arg) {
…
}
RPC Goals
• Ease of programming
– Familiar model for programmers (just make a function call)
• Hide complexity (or some of it – we’ll see later)
• Automate much of the implementation work
• Standardize some low-level data packaging protocols
across components
Historical note: Seems obvious in retrospect, but RPC was only invented in the
‘80s. See Birrell & Nelson, “Implementing Remote Procedure Call” ... or
Bruce Nelson, Ph.D. Thesis, Carnegie Mellon University: Remote Procedure
Call, 1981 :)
RPC Architecture Overview
[Figure: RPC architecture diagram; timeline labels: wait (client), work (server)]
Client:
{ ...
resp = foo("hello");
}
Server:
int foo(char* arg) {
…
}
Why Marshaling?
• Calling and called procedures run on different
machines, with different address spaces
– Therefore, pointers are meaningless
– Plus, perhaps different environments, different operating
systems, different machine organizations, …
– E.g.: the endian problem:
• If I send a request to transfer $1 from my little-endian machine,
the server might try to transfer $16M if it’s a big-endian machine
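The $1 versus $16M example can be reproduced directly. The sketch below uses hypothetical helper names (`put_u32_be`, `get_u32_be`, `get_u32_le`) and shows both the failure and the usual fix, which is to marshal integers in one agreed byte order regardless of the host.

```c
#include <stdint.h>

/* Marshal v in big-endian ("network") byte order, whatever the host is. */
void put_u32_be(uint8_t out[4], uint32_t v) {
    out[0] = (uint8_t)(v >> 24);
    out[1] = (uint8_t)(v >> 16);
    out[2] = (uint8_t)(v >> 8);
    out[3] = (uint8_t)v;
}

/* Interpret 4 bytes as a big-endian integer. */
uint32_t get_u32_be(const uint8_t in[4]) {
    return ((uint32_t)in[0] << 24) | ((uint32_t)in[1] << 16) |
           ((uint32_t)in[2] << 8)  |  (uint32_t)in[3];
}

/* Interpret 4 bytes as a little-endian integer. */
uint32_t get_u32_le(const uint8_t in[4]) {
    return ((uint32_t)in[3] << 24) | ((uint32_t)in[2] << 16) |
           ((uint32_t)in[1] << 8)  |  (uint32_t)in[0];
}
```

A little-endian machine stores the value 1 as the bytes `01 00 00 00`. If those raw bytes go on the wire and a big-endian receiver reads them as-is, `get_u32_be` yields 0x01000000 = 16,777,216, which is the "$16M" in the example. If the sender marshals with `put_u32_be` and the receiver unmarshals with `get_u32_be`, both sides see 1 on any hardware.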
struct fooargs {
    string msg<255>;
    int baz;
};
And Describes Functions
program FOOPROG {
    version VERSION {
        void FOO(fooargs) = 1;
        void BAR(barargs) = 2;
    } = 1;
} = 9999;
More requirements
• Provide reliable transmission (or indicate failure)
– May have a “runtime” that handles this
• At-most-once
– Tag each request with a sequence # so the server can detect
network retransmissions and avoid executing a request twice
– The server must remember the sequence #s it has seen
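One way the sequence-number idea might look on the server is sketched below. Everything here is an assumption made for illustration: the `amo_server` type, a fixed `MAX_CLIENTS`, integer client ids, one outstanding request per client, and sequence numbers that only increase. The `deposit` handler is just example application state.

```c
#include <stdint.h>
#include <string.h>

#define MAX_CLIENTS 8   /* assumption: small fixed number of client ids */

typedef int (*handler_fn)(int arg);

/* Per-client memory of the last request executed and its reply. */
typedef struct {
    int      seen[MAX_CLIENTS];
    uint32_t last_seq[MAX_CLIENTS];
    int      last_reply[MAX_CLIENTS];
} amo_server;

void amo_init(amo_server *s) { memset(s, 0, sizeof *s); }

/* Runs h(arg) at most once per (client, seq). A retransmission of the
 * latest request gets the saved reply without re-running h.
 * Returns 0 on success, -1 for a bad client id or a stale seq. */
int amo_dispatch(amo_server *s, int client, uint32_t seq,
                 handler_fn h, int arg, int *reply) {
    if (client < 0 || client >= MAX_CLIENTS) return -1;
    if (s->seen[client] && seq == s->last_seq[client]) {
        *reply = s->last_reply[client];   /* duplicate: replay only */
        return 0;
    }
    if (s->seen[client] && seq < s->last_seq[client]) return -1;
    *reply = h(arg);                      /* first time: execute */
    s->seen[client] = 1;
    s->last_seq[client] = seq;
    s->last_reply[client] = *reply;
    return 0;
}

/* Example non-idempotent procedure: running it twice changes the result. */
int balance = 0;
int deposit(int amount) { balance += amount; return balance; }
```

Dispatching `deposit(10)` twice with the same client id and sequence # leaves `balance` at 10 and returns 10 both times. Note that this table lives in memory, so it does not survive a server reboot.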
At-least-once versus at-most-once?
let's take an example: acquiring a lock
if client and server stay up, client receives lock
if client fails, it may have the lock or not (server
needs a plan!)
if server fails, client may have lock or not
at-least-once: client keeps trying
at-most-once: client will receive an exception
what does a client do in the case of an exception?
need to implement some application-specific protocol
ask server, do I have the lock?
server needs to have a plan for remembering state
across reboots
e.g., store locks on disk.
at-least-once (if we never give up)
clients keep trying. server may run procedure several
times
server must use application state to handle duplicates
if requests are not idempotent
but difficult to make all requests idempotent
e.g., server could store on disk who has the lock and the req id
check table for each request
even if server fails and reboots, we get correct
semantics
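The at-least-once behavior described above can be sketched as a client-side retry loop. The names here (`call_at_least_once`, `send_fn`, `flaky_net`, `flaky_send`) are hypothetical, the transport is a fake that drops replies, and the retry bound `max_tries` is an assumption added so the sketch terminates; the notes' version never gives up.

```c
/* Transport hook: returns 0 if a reply arrived, -1 on timeout or loss. */
typedef int (*send_fn)(void *ctx, int request, int *reply);

/* At-least-once client: resend until a reply arrives.
 * Returns the number of attempts used, or -1 if max_tries ran out. */
int call_at_least_once(send_fn send, void *ctx, int request,
                       int *reply, int max_tries) {
    for (int i = 0; i < max_tries; i++)
        if (send(ctx, request, reply) == 0)
            return i + 1;
    return -1;
}

/* Fake network for illustration: the request always reaches the
 * server, but the first `drops` replies are lost on the way back. */
typedef struct {
    int drops;      /* replies still to be dropped */
    int executed;   /* how many times the server ran the procedure */
} flaky_net;

int flaky_send(void *ctx, int request, int *reply) {
    flaky_net *n = ctx;
    n->executed++;                     /* server executes every copy */
    if (n->drops > 0) { n->drops--; return -1; }
    *reply = request * 2;
    return 0;
}
```

With two dropped replies, the client succeeds on its third attempt, and `executed` ends up at 3: the server ran the procedure three times for one logical call. That is harmless for an idempotent request and a bug otherwise, which is why the server needs application state to handle duplicates.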
What is right?
depends where RPC is used.
simple applications:
at-most-once is cool (more like procedure calls)
more sophisticated applications:
need an application-level plan in both cases
(comparison from Kaashoek, 6.842 notes)
Implementing at-most-once
• At-least-once: Just keep retrying on client side until you get a
response.
– Server just processes requests as normal, doesn’t remember
anything. Simple!
• Zero-copy tricks:
– Representation: Send on the wire in native format and
indicate that format with a bit/byte beforehand. What
does this do? Think about sending uint32 between two
little-endian machines
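A rough sketch of the "native format plus tag" idea follows. The one-byte tag values, the 5-byte layout, and the function names are all assumptions for illustration. The point it tries to show is that when sender and receiver share a byte order, the receiver does no conversion at all.

```c
#include <stdint.h>
#include <string.h>

enum { FMT_LITTLE = 0, FMT_BIG = 1 };   /* hypothetical tag values */

/* Detect this machine's byte order by inspecting the first byte of 1. */
uint8_t host_format(void) {
    const uint16_t probe = 1;
    uint8_t first;
    memcpy(&first, &probe, 1);
    return first == 1 ? FMT_LITTLE : FMT_BIG;
}

uint32_t swap32(uint32_t v) {
    return (v >> 24) | ((v >> 8) & 0x0000FF00u) |
           ((v << 8) & 0x00FF0000u) | (v << 24);
}

/* Sender: one tag byte naming the native format, then the raw bytes. */
void send_native_u32(uint8_t out[5], uint32_t v) {
    out[0] = host_format();
    memcpy(out + 1, &v, 4);
}

/* Receiver: copy the bytes, and swap only if the sender's format
 * differs from ours. Same-format peers pay no conversion cost. */
uint32_t recv_native_u32(const uint8_t in[5]) {
    uint32_t v;
    memcpy(&v, in + 1, 4);
    return in[0] == host_format() ? v : swap32(v);
}
```

Compare this with always converting to a canonical order: two little-endian machines would each swap every integer, once on send and once on receive, only to end up with the bytes they started with.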
Next Time
• A bunch of RPC library examples
• With code!