11 Distributed File Systems

The document discusses distributed file systems, focusing on their architecture, operations, and challenges such as consistency, replication, and fault tolerance. It highlights the NFS architecture and compares versions, as well as introduces concepts like client-side caching and the Google File System's chunk-based approach. Additionally, it addresses high availability in peer-to-peer systems and the trade-offs between replication and erasure coding for fault tolerance.

Distributed Systems

Principles and Paradigms

Chapter 11
(version October 15, 2007)

Maarten van Steen


Vrije Universiteit Amsterdam, Faculty of Science
Dept. Mathematics and Computer Science
Room R4.20. Tel: (020) 598 7784
E-mail: [email protected], URL: www.cs.vu.nl/~steen/
01 Introduction
02 Architectures
03 Processes
04 Communication
05 Naming
06 Synchronization
07 Consistency and Replication
08 Fault Tolerance
09 Security
10 Distributed Object-Based Systems
11 Distributed File Systems
12 Distributed Web-Based Systems
13 Distributed Coordination-Based Systems
00 – 1 /
Distributed File Systems

General goal: Try to make a file system transparently available to remote clients.

[Figure: Remote access model. The file stays on the server; the client sends requests to access the remote file.]

[Figure: Upload/download model. 1. The file is moved to the client. 2. Accesses are done on the client. 3. When the client is done, the file is returned to the server.]
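
The two models can be contrasted with a minimal sketch. The `FileServer` class and its method names are hypothetical, chosen only for illustration; a real system would put a network between the client and these calls.

```python
class FileServer:
    """Toy in-memory server (hypothetical, for illustration only)."""

    def __init__(self):
        self.files = {}

    # Remote access model: every operation is executed at the server.
    def read(self, name, offset, length):
        return self.files[name][offset:offset + length]

    def write(self, name, offset, data):
        old = self.files.get(name, b"")
        padded = old.ljust(offset, b"\0")  # pad if writing past the end
        self.files[name] = padded[:offset] + data + padded[offset + len(data):]

    # Upload/download model: whole files move between client and server.
    def download(self, name):
        return self.files[name]

    def upload(self, name, content):
        self.files[name] = content


server = FileServer()
server.upload("f", b"ab")

# Remote access: the file stays on the server, the client only sends requests.
server.write("f", 2, b"c")
remote_view = server.read("f", 0, 3)

# Upload/download: 1. move the file to the client, 2. access it locally,
# 3. return it to the server when done.
local_copy = bytearray(server.download("f"))
local_copy += b"d"
server.upload("f", bytes(local_copy))
```

In the second variant the server sees nothing of the client's work until the upload, which is what makes consistency harder once several clients hold copies.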

11 – 1 Distributed File Systems/11.1 Architecture


Example: NFS Architecture
NFS is implemented using the Virtual File System abstraction, which is now used for lots of different operating systems:
[Figure: NFS layering on client and server. Both sides have a system call layer on top of a virtual file system (VFS) layer and a local file system interface. The client VFS layer also leads to the NFS client and an RPC client stub; the server has an RPC server stub and the NFS server. The stubs communicate over the network.]

Essence: VFS provides a standard file system interface, and makes it possible to hide the difference between accessing a local or a remote file system.
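
The idea of one interface hiding local and remote access can be sketched as follows. This is not how any real VFS layer is coded; the class names, the mount-by-prefix rule, and the `rpc_call` callable are all assumptions made for the example.

```python
from abc import ABC, abstractmethod


class FileSystem(ABC):
    """The common interface that callers program against."""

    @abstractmethod
    def read(self, path: str) -> bytes: ...


class LocalFS(FileSystem):
    def __init__(self, files):
        self.files = files

    def read(self, path):
        return self.files[path]


class RemoteFS(FileSystem):
    """Stands in for an NFS-style client: it turns a read into an RPC."""

    def __init__(self, rpc_call):
        self.rpc_call = rpc_call  # hypothetical stub: (operation, path) -> bytes

    def read(self, path):
        return self.rpc_call("READ", path)


class VFS:
    """Dispatches on the longest matching mount prefix."""

    def __init__(self):
        self.mounts = {}

    def mount(self, prefix, fs):
        self.mounts[prefix] = fs

    def read(self, path):
        prefix = max((p for p in self.mounts if path.startswith(p)), key=len)
        return self.mounts[prefix].read(path[len(prefix):])


vfs = VFS()
vfs.mount("/", LocalFS({"etc/motd": b"hello"}))
vfs.mount("/remote/", RemoteFS(lambda op, path: f"{op} {path}".encode()))

local_data = vfs.read("/etc/motd")
remote_data = vfs.read("/remote/data.txt")
```

The caller uses `vfs.read` in both cases and cannot tell which request crossed the network.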

Question: Is NFS actually a file system?


11 – 2 Distributed File Systems/11.1 Architecture
NFS File Operations
Operation  v3   v4   Description
Create     Yes  No   Create a regular file
Create     No   Yes  Create a nonregular file
Link       Yes  Yes  Create a hard link to a file
Symlink    Yes  No   Create a symbolic link to a file
Mkdir      Yes  No   Create a subdirectory
Mknod      Yes  No   Create a special file
Rename     Yes  Yes  Change the name of a file
Remove     Yes  Yes  Remove a file from a file system
Rmdir      Yes  No   Remove an empty subdirectory
Open       No   Yes  Open a file
Close      No   Yes  Close a file
Lookup     Yes  Yes  Look up a file by means of a name
Readdir    Yes  Yes  Read the entries in a directory
Readlink   Yes  Yes  Read the path name in a symbolic link
Getattr    Yes  Yes  Get the attribute values for a file
Setattr    Yes  Yes  Set one or more file-attribute values
Read       Yes  Yes  Read the data contained in a file
Write      Yes  Yes  Write data to a file

Question: Anything unusual between v3 and v4?

11 – 3 Distributed File Systems/11.1 Architecture


Cluster-Based File Systems

Observation: When dealing with very large data collections, following a simple client-server approach is not going to work.

Solution 1: For speeding up file accesses, apply striping techniques by which files can be fetched in parallel:
[Figure: Placement of the blocks of files a to e across servers, shown for whole-file distribution and for a file-striped system.]
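
A small sketch of why striping helps. It assumes round-robin placement (block i goes to server i mod N), which is one common choice and not necessarily what the figure shows, and it assumes each server delivers one block per time step. All function names are hypothetical.

```python
def whole_file_placement(num_blocks, num_servers, file_id):
    """All blocks of a file live on a single server."""
    placement = {server: [] for server in range(num_servers)}
    placement[file_id % num_servers] = list(range(num_blocks))
    return placement


def striped_placement(num_blocks, num_servers):
    """Round-robin striping: block i is stored on server i mod num_servers."""
    placement = {server: [] for server in range(num_servers)}
    for block in range(num_blocks):
        placement[block % num_servers].append(block)
    return placement


def fetch_steps(placement):
    """Servers work in parallel, so the busiest server determines the time."""
    return max(len(blocks) for blocks in placement.values())


whole = whole_file_placement(num_blocks=6, num_servers=3, file_id=0)
striped = striped_placement(num_blocks=6, num_servers=3)
```

With six blocks on three servers, the whole-file layout needs six steps from one server, while the striped layout needs two steps from each of the three servers.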

11 – 4 Distributed File Systems/11.1 Architecture


Example: Google File System

Solution 2: Divide files into large 64 MB chunks, and distribute/replicate chunks across many servers.

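[Figure: GFS organization. The GFS client sends (file name, chunk index) to the master and gets back a contact address; it then sends (chunk ID, range) to a chunk server and gets back chunk data. The master sends instructions to the chunk servers and receives chunk-server state. Each chunk server stores its chunks in a Linux file system.]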

A couple of important details:

• The master maintains only a (file name, chunk server) table in main memory ⇒ minimal I/O
• Files are replicated using a primary-backup scheme; the master is kept out of the loop
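
To make the request path concrete, here is a sketch of how a client could turn a byte range into per-chunk requests before asking the master where each chunk lives. Only the 64 MB chunk size comes from the slide; the function name, the table contents, and the server names are invented for illustration.

```python
CHUNK_SIZE = 64 * 1024 * 1024  # 64 MB, as stated on the slide


def chunk_requests(offset, length):
    """Split a byte range into (chunk index, start in chunk, end in chunk)."""
    pieces = []
    end = offset + length
    while offset < end:
        index = offset // CHUNK_SIZE
        start = offset % CHUNK_SIZE
        stop = min(CHUNK_SIZE, start + (end - offset))
        pieces.append((index, start, stop))
        offset += stop - start
    return pieces


# Hypothetical in-memory master table: (file name, chunk index) -> chunk servers.
master_table = {
    ("/logs/a", 0): ["cs1", "cs2", "cs3"],
    ("/logs/a", 1): ["cs2", "cs4", "cs5"],
}

# A read of 8 bytes that straddles the boundary between chunk 0 and chunk 1.
pieces = chunk_requests(CHUNK_SIZE - 4, 8)
servers = [master_table[("/logs/a", index)] for index, _, _ in pieces]
```

Because the table is small and the data itself flows directly between client and chunk servers, the master only handles short lookups.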

11 – 5 Distributed File Systems/11.1 Architecture


RPCs in File Systems

Observation: Many (traditional) distributed file systems deploy remote procedure calls to access files. When wide-area networks need to be crossed, alternatives need to be exploited:

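[Figure: Client-server timelines. (a) The client sends LOOKUP and waits while the server looks up the name, then sends READ and waits while the server reads the file data. (b) The client sends LOOKUP, OPEN, and READ together; the server looks up the name, opens the file, and reads the file data before replying.]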
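
A back-of-the-envelope sketch of why variant (b) pays off on a wide-area link. The round-trip time and the per-operation server cost are made-up numbers, and `total_latency_ms` is a hypothetical helper.

```python
def total_latency_ms(round_trips, operations, rtt_ms, server_ms_per_op):
    """Network cost grows with round trips, server cost with operations."""
    return round_trips * rtt_ms + operations * server_ms_per_op


RTT_MS = 100.0    # assumed wide-area round-trip time
SERVER_MS = 1.0   # assumed server-side cost per operation

# (a) LOOKUP and READ as two separate RPCs.
separate = total_latency_ms(2, 2, RTT_MS, SERVER_MS)

# (b) LOOKUP, OPEN, and READ sent together in one request.
combined = total_latency_ms(1, 3, RTT_MS, SERVER_MS)
```

Under these assumptions (b) does more work at the server yet finishes in roughly half the time, because the round trip dominates.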

11 – 6 Distributed File Systems/11.3 Communication


Example: RPCs in Coda

Observation: When dealing with replicated files, sequentially sending information is not the way to go:

[Figure: A server sends Invalidate messages to two clients and collects their replies, (a) one client at a time and (b) both clients in parallel.]

Note: In Coda, clients can cache files, but will be informed when an update has been performed.
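
The difference between the two panels can be sketched with a thread pool. This is not Coda's actual RPC mechanism; `send_invalidate` is a stand-in for a blocking remote call, and the dictionary-based clients are purely illustrative.

```python
from concurrent.futures import ThreadPoolExecutor


def send_invalidate(client):
    """Stand-in for an RPC that would block for one network round trip."""
    client["cache_valid"] = False
    return (client["name"], "reply")


def invalidate_sequentially(clients):
    # (a) Each call starts only after the previous reply has arrived.
    return [send_invalidate(client) for client in clients]


def invalidate_in_parallel(clients):
    # (b) All calls are outstanding at once; assumes a non-empty client list.
    with ThreadPoolExecutor(max_workers=len(clients)) as pool:
        return list(pool.map(send_invalidate, clients))


clients = [
    {"name": "A", "cache_valid": True},
    {"name": "B", "cache_valid": True},
]
replies = invalidate_in_parallel(clients)
```

With real network calls, the sequential version takes about the sum of the round-trip times and the parallel version about the longest single one.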

11 – 7 Distributed File Systems/11.3 Communication


File Sharing Semantics (1/2)
Problem: When dealing with distributed file systems, we need to take into account the ordering of concurrent read/write operations, and expected semantics (= consistency).

[Figure: (a) Single machine: the original file contains "ab"; 1. process A writes "c", 2. a read by process B gets "abc". (b) Distributed system with a file server: 1. client machine #1 reads "ab", 2. process A writes "c" to its copy, 3. a read by process B on client machine #2 gets "ab".]

11 – 8 Distributed File Systems/11.5 Synchronization


File Sharing Semantics (2/2)

UNIX semantics: a read operation returns the effect of the last write operation ⇒ can only be implemented for remote access models in which there is only a single copy of the file

Transaction semantics: the file system supports transactions on a single file ⇒ issue is how to allow concurrent access to a physically distributed file

Session semantics: the effects of read and write operations are seen only by the client that has opened (a local copy of) the file ⇒ what happens when a file is closed? (Only one client may actually win.)
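
A minimal sketch of session semantics, assuming the simplest possible rule that the last close overwrites the server copy. The `Server` and `Session` classes are hypothetical.

```python
class Server:
    def __init__(self, content):
        self.content = content


class Session:
    """Open copies the file to the client; close writes the copy back."""

    def __init__(self, server):
        self.server = server
        self.local = server.content  # private copy made at open

    def read(self):
        return self.local

    def write(self, data):
        self.local += data  # visible only inside this session

    def close(self):
        self.server.content = self.local  # assumed rule: last close wins


server = Server("ab")
session_1 = Session(server)
session_2 = Session(server)

session_1.write("c")
seen_by_2 = session_2.read()  # still the content as of session_2's open
session_2.write("d")

session_1.close()
after_first_close = server.content
session_2.close()
after_second_close = server.content
```

After both closes, the update made in the first session is gone, which is the "only one client may actually win" case.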

11 – 9 Distributed File Systems/11.5 Synchronization


Example: File Sharing in Coda

Essence: Coda assumes transactional semantics, but without the full-fledged capabilities of real transactions.

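[Figure: Timeline with two clients and a server. In session S_A one client opens file f for reading (Open(RD)) and later closes it; in session S_B the other client opens f for writing (Open(WR)) and later closes it. The server transfers file f on each open and sends an Invalidate to the reading client.]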

Note: Transactional issues reappear in the form of “this ordering could have taken place.”

11 – 10 Distributed File Systems/11.5 Synchronization


Consistency and Replication

Observation: In modern distributed file systems, client-side caching is the preferred technique for attaining performance; server-side replication is done for fault tolerance.

Observation: Clients are allowed to keep (large parts of) a file, and will be notified when control is withdrawn ⇒ servers are now generally stateful

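[Figure: Delegation between client and server. The server holds the old file, the client holds a local copy, and the server ends up with the updated file.]

1. Client asks for file
2. Server delegates file
3. Server recalls delegation
4. Client returns file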
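
The four delegation steps can be sketched as below. The classes are hypothetical and handle a single file; the point is only that the server has to remember who holds the delegation, which is what makes it stateful.

```python
class Client:
    def __init__(self, name):
        self.name = name
        self.local_copy = None

    def recall(self):
        # Step 4: hand the (possibly updated) file back to the server.
        content, self.local_copy = self.local_copy, None
        return content


class DelegatingServer:
    def __init__(self, content):
        self.content = content
        self.holder = None  # per-client state the server must keep

    def request(self, client):
        # Step 1: a client asks for the file.
        if self.holder is not None and self.holder is not client:
            # Step 3: recall the delegation from the current holder.
            self.content = self.holder.recall()
        # Step 2: delegate the file to the requesting client.
        self.holder = client
        client.local_copy = self.content


server = DelegatingServer("old file")
client_a, client_b = Client("A"), Client("B")

server.request(client_a)
client_a.local_copy = "updated file"  # A works on its local copy
server.request(client_b)              # forces the recall from A
```

After the second request, the server and client B both hold the updated file and client A no longer has a valid copy.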

11 – 11 Distributed File Systems/11.6 Consistency and Replication


Example: Client-side Caching in Coda

[Figure: Timeline with client A, a server, and client B. Client A opens file f for reading in two sessions S_A, and client B opens f for writing in two sessions S_B. The server transfers file f on opens, sends an Invalidate (callback break) to client A, and answers one open with "OK (no file transfer)".]

Note: By making use of transactional semantics, it becomes possible to further improve performance.

11 – 12 Distributed File Systems/11.6 Consistency and Replication


Fault Tolerance

Observation: Fault tolerance (FT) is handled by simply replicating file servers, generally using a standard primary-backup protocol:

[Figure: Primary-backup protocol with two clients, the primary server for item x, backup servers, and their data stores; arrows are labeled with the steps below.]

W1. Write request
W2. Forward request to primary
W3. Tell backups to update
W4. Acknowledge update
W5. Acknowledge write completed

R1. Read request
R2. Response to read
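
The steps W1 to W5 and R1 to R2 can be traced in a small in-memory sketch. It ignores failures, timeouts, and concurrency entirely, and the `Server` class and `make_group` helper are hypothetical.

```python
class Server:
    def __init__(self):
        self.store = {}
        self.primary = self  # a server is its own primary until told otherwise
        self.backups = []

    def write(self, key, value):
        # W1: a write request may arrive at any server.
        if self.primary is not self:
            return self.primary.write(key, value)  # W2: forward to primary
        self.store[key] = value
        acks = [backup.update(key, value) for backup in self.backups]  # W3
        if any(ack != "ack" for ack in acks):
            raise RuntimeError("a backup did not acknowledge the update")
        return "write completed"  # W5

    def update(self, key, value):
        self.store[key] = value
        return "ack"  # W4

    def read(self, key):
        # R1/R2: reads are answered from the local data store.
        return self.store.get(key)


def make_group(num_backups):
    primary = Server()
    for _ in range(num_backups):
        backup = Server()
        backup.primary = primary
        primary.backups.append(backup)
    return primary, primary.backups


primary, backups = make_group(2)
result = backups[0].write("x", 1)  # the client happens to talk to a backup
values = [server.read("x") for server in [primary] + backups]
```

The write is acknowledged only after every backup has applied it, so a subsequent read at any server returns the new value.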

11 – 13 Distributed File Systems/11.7 Fault Tolerance


High Availability in P2P Systems

Problem: There are many fully decentralized file-sharing systems, but because churn is high (i.e., nodes come and go all the time), we may face an availability problem.

Solution: Replicate files all over the place (replication factor: $r_{rep}$).

Alternative: Apply erasure coding:

• Partition a file $F$ into $m$ fragments, and recode into a collection $F^{*}$ of $n > m$ fragments
• Property: any $m$ fragments from $F^{*}$ are sufficient to reconstruct $F$.
• Replication factor: $r_{ec} = n/m$
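
The smallest instance of this idea is m = 2, n = 3 with a single XOR parity fragment, giving a replication factor of 3/2. The sketch below uses that code for illustration only; practical systems use more general codes, and the function names are hypothetical.

```python
def xor_bytes(left, right):
    return bytes(x ^ y for x, y in zip(left, right))


def encode(data):
    """Split data into m = 2 fragments and add one XOR parity fragment."""
    if len(data) % 2:
        data += b"\0"  # pad; a real system would also record the true length
    half = len(data) // 2
    first, second = data[:half], data[half:]
    return {0: first, 1: second, 2: xor_bytes(first, second)}


def decode(fragments):
    """Rebuild the data from any two of the three fragments."""
    if len(fragments) < 2:
        raise ValueError("need at least m = 2 fragments")
    if 0 in fragments and 1 in fragments:
        first, second = fragments[0], fragments[1]
    elif 0 in fragments:
        first = fragments[0]
        second = xor_bytes(first, fragments[2])
    else:
        second = fragments[1]
        first = xor_bytes(second, fragments[2])
    return first + second


data = b"distributed!"
fragments = encode(data)
recovered = []
for lost in (0, 1, 2):
    surviving = {k: v for k, v in fragments.items() if k != lost}
    recovered.append(decode(surviving))
```

Any single fragment can be lost without losing the file, at a storage cost of 1.5 times the file size instead of 2 times for keeping two full copies.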

11 – 14 Distributed File Systems/11.7 Fault Tolerance


Replication vs. Erasure Coding

With an average node availability $a$, and required file unavailability $\epsilon$, we have for erasure coding:

$$1 - \epsilon = \sum_{i=m}^{r_{ec} \cdot m} \binom{r_{ec} \cdot m}{i} a^{i} (1 - a)^{r_{ec} \cdot m - i}$$

and for file replication:

$$1 - \epsilon = 1 - (1 - a)^{r_{rep}}$$
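
Both formulas are easy to evaluate numerically. The sketch below writes n for the total number of fragments (the product of the erasure-coding replication factor and m); the function names are hypothetical, and `min_replicas` assumes a > 0 so that the loop terminates.

```python
from math import comb


def availability_replication(a, r_rep):
    """File is available if at least one of r_rep replicas is up."""
    return 1 - (1 - a) ** r_rep


def availability_erasure(a, m, n):
    """File is available if at least m of the n fragments are up."""
    return sum(comb(n, i) * a**i * (1 - a) ** (n - i) for i in range(m, n + 1))


def min_replicas(a, epsilon):
    """Smallest replication factor with file unavailability at most epsilon."""
    r_rep = 1
    while 1 - availability_replication(a, r_rep) > epsilon:
        r_rep += 1
    return r_rep


two_replicas = availability_replication(0.5, 2)
two_of_three = availability_erasure(0.5, m=2, n=3)
needed = min_replicas(0.5, 0.01)
```

As a sanity check, setting m = 1 makes the erasure-coding expression coincide with the replication formula for n replicas.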

[Figure: The ratio $r_{rep}/r_{ec}$ (vertical axis, ticks from 1.4 to 2.2) plotted against node availability (horizontal axis, 0.2 to 1).]

11 – 15 Distributed File Systems/11.7 Fault Tolerance
