
10 Distributed File Systems

This document discusses distributed file systems, focusing on Sun NFS and Coda. It covers key concepts such as communication, naming, caching, replication, and fault tolerance within these systems. The document also highlights the differences between NFS versions and the architecture of Coda, emphasizing their functionalities and challenges in distributed environments.

Distributed Systems

Principles and Paradigms

Chapter 10
(version 27th November 2001)

Maarten van Steen


Vrije Universiteit Amsterdam, Faculty of Science
Dept. Mathematics and Computer Science
Room R4.20. Tel: (020) 444 7784
E-mail:[email protected], URL: www.cs.vu.nl/ steen/

01 Introduction
02 Communication
03 Processes
04 Naming
05 Synchronization
06 Consistency and Replication
07 Fault Tolerance
08 Security
09 Distributed Object-Based Systems
10 Distributed File Systems
11 Distributed Document-Based Systems
12 Distributed Coordination-Based Systems
Distributed File Systems

Sun NFS

Coda

10 – 1 Distributed File Systems/


Sun NFS

Sun Network File System: Now version 3, version 4 is coming up.

Basic model: Remote file service: try to make a file system transparently available to remote clients.

Follows remote access model (a) instead of upload/download model (b):

[Figure: (a) Remote access model: the file stays on the server; requests from the client to access the remote file are sent to the server. (b) Upload/download model: 1. File moved to client. 2. Accesses are done on client. 3. When client is done, file is returned to server.]

10 – 2 Distributed File Systems/10.1 NFS


NFS Architecture
NFS is implemented using the Virtual File System abstraction, which is now used for lots of different operating systems:

[Figure: Client and server each have a system call layer on top of a virtual file system (VFS) layer. On the client, the VFS layer sits on the local file system interface and on the NFS client, which uses an RPC client stub. On the server, an RPC server stub feeds the NFS server, which sits next to the local file system interface under the VFS layer. The two stubs communicate over the network.]

Essence: VFS provides a standard file system interface, and makes it possible to hide the difference between accessing a local and a remote file system.
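Sketch: The role of the VFS layer can be illustrated with a minimal Python model. All class and method names here (FileSystem, LocalFileSystem, RemoteFileSystem, VFS) are hypothetical and not taken from any real kernel; the "remote" backend simply calls a server object directly where a real NFS client would go through an RPC stub.

```python
from abc import ABC, abstractmethod


class FileSystem(ABC):
    """Hypothetical VFS-style interface shared by all backends."""

    @abstractmethod
    def read(self, path):
        ...

    @abstractmethod
    def write(self, path, data):
        ...


class LocalFileSystem(FileSystem):
    """In-memory stand-in for a local file system."""

    def __init__(self):
        self._files = {}

    def read(self, path):
        return self._files[path]

    def write(self, path, data):
        self._files[path] = data


class RemoteFileSystem(FileSystem):
    """Stand-in for the NFS client; a direct call replaces the RPC stub."""

    def __init__(self, server):
        self._server = server

    def read(self, path):
        return self._server.read(path)

    def write(self, path, data):
        self._server.write(path, data)


class VFS:
    """Dispatches each call to the backend mounted at the longest matching prefix."""

    def __init__(self):
        self._mounts = {}

    def mount(self, prefix, fs):
        self._mounts[prefix.rstrip("/") or "/"] = fs

    def _resolve(self, path):
        matches = [p for p in self._mounts
                   if p == "/" or path == p or path.startswith(p + "/")]
        prefix = max(matches, key=len)
        rest = path if prefix == "/" else path[len(prefix):]
        return self._mounts[prefix], "/" + rest.lstrip("/")

    def read(self, path):
        fs, rest = self._resolve(path)
        return fs.read(rest)

    def write(self, path, data):
        fs, rest = self._resolve(path)
        fs.write(rest, data)
```

A caller uses vfs.read and vfs.write in the same way for both kinds of path; only the mount table decides whether the call stays local or is forwarded.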

Question: Is NFS actually a file system?


NFS File Operations
Operation  v3   v4   Description
Create     Yes  No   Create a regular file
Create     No   Yes  Create a nonregular file
Link       Yes  Yes  Create a hard link to a file
Symlink    Yes  No   Create a symbolic link to a file
Mkdir      Yes  No   Create a subdirectory
Mknod      Yes  No   Create a special file
Rename     Yes  Yes  Change the name of a file
Remove     Yes  Yes  Remove a file from a file system
Rmdir      Yes  No   Remove an empty subdirectory
Open       No   Yes  Open a file
Close      No   Yes  Close a file
Lookup     Yes  Yes  Look up a file by means of a name
Readdir    Yes  Yes  Read the entries in a directory
Readlink   Yes  Yes  Read the path name in a symbolic link
Getattr    Yes  Yes  Get the attribute values for a file
Setattr    Yes  Yes  Set one or more file-attribute values
Read       Yes  Yes  Read the data contained in a file
Write      Yes  Yes  Write data to a file

Question: Anything unusual between v3 and v4?



Communication in NFS
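Essence: All communication is based on the (best-effort) Open Network Computing RPC (ONC RPC). Version 4 now also supports compound procedures:

[Figure: two client-server timelines. (a) The client sends LOOKUP, OPEN, and READ as separate requests; the server looks up the name, opens the file, and reads the file data in turn. (b) The client sends LOOKUP, OPEN, and READ in a single request; the server looks up the name, opens the file, and reads the file data before replying.]

(a) Normal RPC

(b) Compound RPC: first failure breaks execution of rest of the RPC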
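Sketch: The "first failure breaks execution" rule can be modeled in a few lines of Python. This is an illustration only, not the NFSv4 wire protocol: the FILES store, the error strings, and the execute_compound function are assumptions made for the example.

```python
# Hypothetical server-side store: file name -> contents.
FILES = {"mbox": b"hello"}


def execute_compound(ops):
    """Run (operation, argument) pairs in order and stop at the first failure.

    Returns the list of (operation, result) pairs produced so far, so the
    client can see which operation failed and that later ones never ran.
    """
    current = None   # file selected by LOOKUP
    opened = False   # set by OPEN
    results = []
    for name, arg in ops:
        if name == "LOOKUP":
            if arg not in FILES:
                results.append((name, "ERR_NOENT"))
                break
            current = arg
            results.append((name, "OK"))
        elif name == "OPEN":
            if current is None:
                results.append((name, "ERR_NOFILEHANDLE"))
                break
            opened = True
            results.append((name, "OK"))
        elif name == "READ":
            if not opened:
                results.append((name, "ERR_NOT_OPEN"))
                break
            results.append((name, FILES[current]))
        else:
            results.append((name, "ERR_NOTSUPP"))
            break
    return results
```

With this model, a compound of LOOKUP, OPEN, READ on an existing file yields three results from one request, while the same compound on a missing file yields a single failed LOOKUP and nothing else.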

Question: What’s the use of compound RPCs?


Naming in NFS (1/2)

Essence: NFS provides support for mounting remote file systems (and even directories) into a client's local name space:

[Figure: Client A (directories remote and bin; remote contains vu, which contains mbox), Server (directory users contains steen, which contains mbox), Client B (directories work and bin; work contains me, which contains mbox). The exported directory is mounted by each client over the network.]

Watch it: Different clients may have different local name spaces. This may make file sharing extremely difficult (Why?).

Question: What are the solutions to this problem?


Naming in NFS (2/2)

Note: A server cannot export an imported directory. The client must mount the server-imported directory:

[Figure: Client (bin), Server A (packages, draw, install), Server B (draw, install), connected by a network. The client imports a directory from server A; server A imports a directory from server B. The exported directory contains an imported subdirectory, so the client needs to explicitly import that subdirectory from server B.]



Automounting in NFS
Problem: To share files, we partly standardize local name spaces and mount shared directories. Mounting very large directories (e.g., all subdirectories in a directory like the /home used in the figure below) takes a lot of time (Why?).

Solution: Mount on demand (automounting)

[Figure: client machine (NFS client, automounter, local file system interface, directory home with subdirectory alice) and server machine (directory users with subdirectory alice). 1. Lookup "/home/alice". 2. Create subdir "alice". 3. Mount request. 4. Mount subdir "alice" from server.]
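Sketch: Mounting on demand can be modeled as a lookup that mounts a user's directory only the first time it is referenced. The Automounter class, its exports dictionary, and the fixed /home prefix are assumptions for illustration; a real automounter talks to the kernel and to the NFS server.

```python
class Automounter:
    """Mounts /home/<user> lazily, on first lookup."""

    def __init__(self, exports):
        self.exports = exports   # user -> directory contents exported by the server
        self.mounted = {}        # user -> mounted directory contents
        self.mount_count = 0     # number of mount requests actually issued

    def lookup(self, path):
        parts = path.strip("/").split("/")
        if len(parts) < 2 or parts[0] != "home":
            raise FileNotFoundError(path)
        user = parts[1]
        if user not in self.mounted:
            if user not in self.exports:
                raise FileNotFoundError(path)
            # Create the local subdirectory and mount the remote one into it.
            self.mounted[user] = self.exports[user]
            self.mount_count += 1
        return self.mounted[user]
```

Only directories that are actually looked up get mounted; repeated lookups of the same directory reuse the existing mount.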

Question: What’s the main drawback of having the automounter in the loop?
File Sharing Semantics (1/2)
Problem: When dealing with distributed file systems, we need to take into account the ordering of concurrent read/write operations, and expected semantics (= consistency).

[Figure: (a) Single machine: the original file contains "ab". 1. Process A writes "c". 2. Process B's read gets "abc". (b) Distributed system with a file server holding the original file "ab": 1. Process A on client machine #1 reads "ab". 2. Process A writes "c" to its copy, which becomes "abc". 3. Process B on client machine #2 reads from the file server and gets "ab".]


File Sharing Semantics (2/2)

  
UNIX semantics: a read operation returns the effect of the last write operation ⇒ can only be implemented for remote access models in which there is only a single copy of the file

Transaction semantics: the file system supports transactions on a single file ⇒ issue is how to allow concurrent access to a physically distributed file

Session semantics: the effects of read and write operations are seen only by the client that has opened (a local copy of) the file ⇒ what happens when a file is closed (only one client may actually win)
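Sketch: Session semantics and the "only one client may actually win" effect can be shown with a toy model in Python. FileServer and Session are hypothetical names; the model assumes whole-file copies and that the last close simply overwrites the server copy.

```python
class FileServer:
    """Holds the authoritative copy of each file."""

    def __init__(self, files):
        self.files = dict(files)


class Session:
    """Open-to-close session working on a private local copy."""

    def __init__(self, server, name):
        self.server = server
        self.name = name
        self.copy = bytearray(server.files[name])   # copy made at open

    def read(self):
        return bytes(self.copy)

    def write(self, data):
        self.copy += data   # append to the local copy only

    def close(self):
        # The local copy replaces the server copy: the last close wins.
        self.server.files[self.name] = bytes(self.copy)
```

If two clients open the same file, append different data, and close one after the other, each sees only its own changes during the session, and the server ends up with the copy of whichever client closed last.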



File Locking in NFS

Observation: It could have been simple, but it isn’t. NFS supports an explicit locking protocol (stateful), but also an implicit share reservation approach:

Requested access vs. current denial state:

                  Current denial state
Requested access  None  Read  Write  Both
Read              OK    Fail  OK     Fail
Write             OK    OK    Fail   Fail
Both              OK    Fail  Fail   Fail

Current access state vs. requested denial state:

                      Requested denial state
Current access state  None  Read  Write  Both
Read                  OK    Fail  OK     Fail
Write                 OK    OK    Fail   Fail
Both                  OK    Fail  Fail   Fail
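Sketch: Both tables follow one rule: an open fails if the requested access overlaps the current denial state, or if the requested denial overlaps the current access state. The bit-mask encoding and the function name open_allowed below are assumptions chosen for illustration, not the protocol's own constants.

```python
# Access and denial states encoded as bit masks (illustrative values).
NONE = 0
READ = 1
WRITE = 2
BOTH = READ | WRITE


def open_allowed(req_access, req_deny, cur_access, cur_deny):
    """Return True if an open with the requested access and denial succeeds.

    cur_access and cur_deny describe what other clients currently hold
    on the file.
    """
    access_blocked = (req_access & cur_deny) != 0   # first table
    deny_blocked = (req_deny & cur_access) != 0     # second table
    return not (access_blocked or deny_blocked)
```

For example, a read request succeeds while others deny only writes, but a request that denies reads fails while another client already has the file open for reading.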

Question: What’s the use of these share reservations?


Caching & Replication

Essence: Clients are on their own.

Open delegation: Server will explicitly permit a client machine to handle local operations from other clients on that machine. Good for performance. Does require that the server can take over when necessary:

[Figure: client (local copy, later an updated file) and server (old file). 1. Client asks for file. 2. Server delegates file. 3. Server recalls delegation. 4. Client returns file.]

Question: Would this scheme fit into v3?

Question: What kind of file access model are we dealing with?


Fault Tolerance
Important: Until v4, fault tolerance was easy due to the stateless servers. Now, problems come from the use of an unreliable RPC mechanism, but also stateful servers that have delegated matters to clients.

RPC: Cannot detect duplicates. Solution: use a duplicate-request cache:

[Figure: three client-server timelines (a), (b), (c). In each, the client sends a request with XID = 1234 and later retransmits it with the same XID. The server processes the request and keeps the reply in its cache; in one of the cases the reply is lost, and the retransmission is answered from the cache.]
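Sketch: A duplicate-request cache keyed by the transaction identifier (XID) can be modeled as follows. The RpcServer class and its handler argument are hypothetical; the model ignores requests that are still in progress and never evicts cache entries, both of which a real server has to handle.

```python
class RpcServer:
    """Executes each XID at most once and replays the cached reply for duplicates."""

    def __init__(self, handler):
        self.handler = handler   # function that executes a request
        self.cache = {}          # XID -> reply already sent
        self.executions = 0      # how many requests were really executed

    def handle(self, xid, request):
        if xid in self.cache:
            # Retransmission: resend the old reply, do not execute again.
            return self.cache[xid]
        self.executions += 1
        reply = self.handler(request)
        self.cache[xid] = reply
        return reply
```

This matters for operations that are not idempotent: if a reply is lost and the client retransmits, the operation is still carried out only once.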

Locking/Open delegation: Essentially, recovered server offers clients a grace period to reclaim locks. When period is over, the server starts its normal local manager function again.
Security
Essence: Set up a secure RPC channel between client and server:

Secure NFS: Use Diffie-Hellman key exchange to set up a secure channel. However, it uses only 192-bit keys, which have been shown to be easy to break.

RPCSEC_GSS: A standard interface that allows integration with existing security services:

[Figure: client machine and server machine with the same layering, connected by a network: NFS client / NFS server on top of the RPC client stub / RPC server stub, on top of RPCSEC_GSS, on top of GSS-API, which gives access to Kerberos, LIPKEY, and other security services.]


Coda File System

Developed in the 90s as a descendant of the Andrew File System (CMU)

Now shipped with Linux distributions (after 10 years!)

Emphasis: support for mobile computing, in particular disconnected operation.

[Figure: Virtue clients with transparent access to a Vice file server.]

10 – 15 Distributed File Systems/10.2 Coda


Coda Architecture
[Figure: Virtue client machine. User processes and the Venus process run at user level; Venus uses an RPC client stub. Below them are the virtual file system layer and the local file system interface in the local OS, which connects to the network.]

Note: The core of the client machine is the Venus process (yes, it’s a nasty beast). Note that most stuff is at user level.


Communication in Coda (1/2)

Essence: All client-server communication (and server-server communication) is handled by means of a reliable RPC subsystem. Coda RPC supports side effects:

[Figure: client application and server, each with an RPC stub and a side-effect component. The RPC client stub and RPC server stub communicate through the RPC protocol; the client side effect and server side effect communicate through an application-specific protocol.]

Note: Side effects allow a separate protocol to handle, e.g., multimedia streams.


Communication in Coda (2/2)

Issue: Coda servers allow clients to cache whole files. Modifications by other clients are notified through invalidation messages ⇒ there is a need for multicast RPC:

[Figure: timelines of a server sending Invalidate to two clients and receiving their Reply messages. (a) The second Invalidate is sent only after the first Reply has arrived. (b) Both Invalidate messages are sent at once.]

(a) Sequential RPCs

(b) Multicast RPCs

Question: Why do multi RPCs really help?
Naming in Coda

Essence: Similar remote mounting mechanism as in NFS, except that there is a shared name space between all clients:

[Figure: Client A (afs, local), Server (bin, pkg), Client B (afs). The exported directory is mounted by each client over the network under afs, so that both clients see bin and pkg under the same names. Naming is inherited from the server's name space.]


File Handles in Coda
Background: Coda assumes that files may be replicated between servers. The issue becomes how to track a file in a location-transparent way:

[Figure: A file identifier consists of an RVID and a file handle. The volume replication DB maps the RVID to VID1, VID2. The volume location DB maps each VID to a server (Server1, Server2). The file handle is then sent to the file servers found this way.]

Files are contained in a volume (cf. UNIX file system on disk)
Volumes have a Replicated Volume Identifier (RVID)
Volumes may be replicated; each physical volume has a VID
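Sketch: The two-step resolution from RVID to servers can be written as two dictionary lookups. The database contents and the identifiers ("rvid-7", "vid-1", "server1", "fh-42") are made up for the example; only the structure (RVID to VIDs, VID to server) comes from the slide.

```python
# Hypothetical contents of the two databases.
VOLUME_REPLICATION_DB = {"rvid-7": ["vid-1", "vid-2"]}            # RVID -> VIDs
VOLUME_LOCATION_DB = {"vid-1": "server1", "vid-2": "server2"}    # VID -> server


def resolve(rvid, file_handle):
    """Return (server, VID, file handle) for every replica of the volume."""
    replicas = []
    for vid in VOLUME_REPLICATION_DB[rvid]:
        server = VOLUME_LOCATION_DB[vid]
        replicas.append((server, vid, file_handle))
    return replicas
```

The client only ever names a file by RVID and file handle; which servers hold the volume is looked up at access time, which is what makes the identifier location transparent.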



File Sharing Semantics in Coda

Essence: Coda assumes transactional semantics, but without the full-fledged capabilities of real transactions.

[Figure: timeline with two clients and a server. In session S_A, one client does Open(RD), receives file f from the server, and later does Close. In session S_B, the other client does Open(WR), receives file f, and later does Close. The server sends an Invalidate to the reading client.]

Note: Transactional issues reappear in the form of “this ordering could have taken place.”


Caching in Coda

Essence: Combined with the transactional semantics, we obtain flexibility when it comes to letting clients operate on local copies:

[Figure: timeline with client A, the server, and client B. Client A runs two sessions S_A, each with Open(RD) and Close; client B runs two sessions S_B, each with Open(WR) and Close. The server sends file f to both clients on their first open. It later sends client A an Invalidate (callback break), and client A receives file f again on its second open. Client B's second open is answered with OK (no file transfer).]

Note: A writer can continue to work on its local copy; a reader will have to get a fresh copy on the next open.
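Sketch: The callback mechanism can be modeled with a server that remembers which clients hold a copy and breaks their callbacks when another client stores a new version. CodaServer, Client, and their methods are hypothetical names, and the model transfers whole files on open and on close, which is a simplification.

```python
class CodaServer:
    def __init__(self, files):
        self.files = dict(files)
        self.callbacks = {}   # file name -> set of clients holding a valid copy

    def fetch(self, client, name):
        self.callbacks.setdefault(name, set()).add(client)
        return self.files[name]

    def store(self, writer, name, data):
        self.files[name] = data
        # Callback break: every other client's copy is no longer valid.
        for client in self.callbacks.get(name, set()) - {writer}:
            client.callback_break(name)
        self.callbacks[name] = {writer}


class Client:
    def __init__(self, server):
        self.server = server
        self.cache = {}
        self.transfers = 0   # number of file transfers from the server

    def open(self, name):
        if name not in self.cache:
            self.cache[name] = self.server.fetch(self, name)
            self.transfers += 1
        return self.cache[name]

    def close_after_write(self, name, data):
        self.cache[name] = data
        self.server.store(self, name, data)

    def callback_break(self, name):
        self.cache.pop(name, None)
```

After the writer closes, the reader's cached copy is dropped, so its next open causes a new transfer, while the writer's next open is served from its own cache.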

Question: Would it be OK if the reader continued to use its own local copy?


Server Replication in Coda (1/2)

Essence: Coda uses ROWA (read-one, write-all) for server replication:

Files are grouped into volumes (cf. traditional UNIX file system)

Collection of servers replicating the same volume form that volume’s Volume Storage Group (VSG)

Writes are propagated to a file’s VSG

Reads are done from one server in a file’s VSG

Problem: what to do when the VSG partitions and the partition is later healed?

[Figure: servers S1, S2, and S3 with a broken network between them. Client A can reach S1 and S2; client B can reach S3.]


Server Replication in Coda (2/2)

Solution: Detect inconsistencies using version vectors:

$CVV_i(f)[j] = k$ means that server $S_i$ knows that server $S_j$ has seen version $k$ of file $f$.

When a client reads file $f$ from server $S_i$, it receives $CVV_i(f)$.

Updates are multicast to all reachable servers (client’s accessible VSG), which increment their $CVV_i(f)[i]$.

When the partition is restored, comparison of version vectors will allow detection of conflicts and possible reconciliation.

Note: the client informs a server about the servers in the AVSG where the update has also taken place.
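Sketch: Conflict detection with version vectors amounts to an element-wise comparison: two vectors conflict when neither is greater than or equal to the other in every position. The functions apply_update and compare, the zero-based server indices, and the starting vector [1, 1, 1] are assumptions for illustration. Following the note above, each server in the AVSG also increments the entries of the other AVSG members.

```python
def apply_update(vectors, avsg):
    """Record one update to file f at every server in the client's AVSG.

    vectors[i] is CVV_i(f); avsg lists the indices of the reachable servers.
    """
    for i in avsg:
        for j in avsg:
            vectors[i][j] += 1


def compare(a, b):
    """Classify two version vectors of the same length."""
    a_ge_b = all(x >= y for x, y in zip(a, b))
    b_ge_a = all(y >= x for x, y in zip(a, b))
    if a_ge_b and b_ge_a:
        return "equal"
    if a_ge_b:
        return "a dominates"
    if b_ge_a:
        return "b dominates"
    return "conflict"
```

With three servers, one client updating through servers 0 and 1 and another updating through server 2 during a partition gives vectors [2, 2, 1] and [1, 1, 2], which compare as a conflict once the partition heals.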



Fault Tolerance
Note: Coda achieves high availability through client-side caching and server replication.

Disconnected operation: When a client is no longer connected to one of the servers, it may continue with the copies of files that it has cached. Requires that the cache is properly filled (hoarding).

Compute a priority for each file

Bring the user’s cache into equilibrium (hoard walk):

– There is no uncached file with higher priority than a cached file
– The cache is full or no uncached file has nonzero priority
– Each cached file is a copy of a file maintained by the client’s AVSG
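Sketch: The three equilibrium conditions can be checked directly. This model is a simplification: the function names are hypothetical, cache capacity is counted in number of files rather than bytes, and "is a copy of a file maintained by the AVSG" is reduced to set membership.

```python
def in_equilibrium(priorities, cached, capacity, avsg_files):
    """Check the three hoarding conditions for a set of cached file names."""
    uncached = set(priorities) - cached
    lowest_cached = min((priorities[f] for f in cached), default=float("inf"))
    # 1. No uncached file has higher priority than a cached file.
    cond1 = all(priorities[f] <= lowest_cached for f in uncached)
    # 2. The cache is full, or no uncached file has nonzero priority.
    cond2 = len(cached) >= capacity or all(priorities[f] == 0 for f in uncached)
    # 3. Every cached file is maintained by the client's AVSG.
    cond3 = cached <= avsg_files
    return cond1 and cond2 and cond3


def hoard_walk(priorities, capacity):
    """Pick the highest-priority files with nonzero priority, up to capacity."""
    wanted = [f for f in priorities if priorities[f] > 0]
    wanted.sort(key=lambda f: priorities[f], reverse=True)
    return set(wanted[:capacity])
```

For priorities a=5, b=3, c=0, d=4 and room for two files, the hoard walk selects a and d; caching a and b instead would violate the first condition because d is uncached with a higher priority than b.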

Note: Disconnected operation works best when there is hardly any write-sharing.
Security

Essence: All communication is based on a secure RPC mechanism that uses secret keys. When logging into the system, a client receives:

A clear token $CT$ from an AS (containing a generated shared secret key $K_S$). $CT$ has time-limited validity.

A secret token $ST = K_{vice}([CT]^{*}_{K_{vice}})$, which is an encrypted and cryptographically sealed version of $CT$.

[Figure: message exchange between Alice and Bob.
1. Alice to Bob: $ST$, $K_S(R_A)$
2. Bob to Alice: $K_S(R_A + 1, R_B)$
3. Alice to Bob: $K_S(R_B + 1)$
4. Bob to Alice: $K_S(K_{S2})$]
