Week 5: Distributed File Systems


Distributed File Systems

File Characteristics
From the Andrew File System work:
- most files are small: transfer whole files rather than disk blocks
- reading is more common than writing
- most access is sequential
- most files have a short lifetime; many applications (such as compilers) generate temporary files
- file sharing (involving writes) is unusual, which argues for client caching
- processes use few files
- files can be divided into classes: handle system files and user files differently

Newer Influences
- wide-area networks
- peer-to-peer systems
- mobility
- untrusted entities

Distributed File Systems


Primarily look at three distributed file systems as we work through the issues.

1. File Transfer Protocol (FTP). Motivation is to provide file sharing (not a distributed file system). 1970s. Connect to a remote machine and interactively send or fetch an arbitrary file. FTP deals with authentication, listing directory contents, ASCII or binary files, etc. Typically, a user connecting to an FTP server must specify an account and password. Often it is convenient to set up a special account in which no password is needed; such systems provide a service called anonymous FTP, where the userid is "anonymous" and the password is typically the user's email address. FTP has largely been superseded by the use of HTTP for file transfer.

2. Sun's Network File System (NFS). Motivated by wanting to extend a Unix file system to a distributed environment, with easy file sharing and compatibility with existing systems. Mid-1980s. Stateless, in that servers do not maintain state about clients. RPC calls supported (a sketch of such an interface follows this list): searching for a file within a directory, reading a set of directory entries, manipulating links and directories, accessing file attributes, and reading/writing file data. The latest version of NFS (version 4) introduces some amount of state.

3. Andrew File System (AFS). Research project at CMU in the 1980s, commercialized by a company called Transarc that was later acquired by IBM. Primary motivation was to build a scalable distributed file system. Look at pictures.

Other older file systems:

1. CODA: AFS spin-off at CMU. Disconnection and fault recovery.
2. Sprite: research project at UCB in the 1980s, to build a distributed Unix system.
3. Echo: Digital SRC.
4. Amoeba Bullet File Server: Tanenbaum research project.
5. xFS: serverless file system, i.e., a file system distributed across multiple machines. Research project at UCB.
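
As a rough illustration of the stateless NFS-style interface described above, here is a minimal sketch in Go. The operation names, types, and signatures are invented for this sketch and are not the actual NFS protocol definitions; the point is that every call carries an opaque file handle and explicit offsets, so the server needs no per-client state.

package nfsrpc

// Fh is an opaque file handle identifying a file or directory on the server.
// Because the server is stateless, every call carries the handle it needs
// rather than relying on a previously opened descriptor.
type Fh [32]byte

// Attr holds a subset of the usual Unix-style file attributes.
type Attr struct {
	Mode  uint32
	Size  uint64
	Mtime int64
}

// StatelessFS lists NFS-style operations; each request is self-contained,
// so the server keeps no per-client state between calls.
type StatelessFS interface {
	Lookup(dir Fh, name string) (Fh, Attr, error)                       // search for a file within a directory
	ReadDir(dir Fh, cookie uint64, count int) ([]string, uint64, error) // read a set of directory entries
	GetAttr(f Fh) (Attr, error)                                         // access file attributes
	Read(f Fh, offset uint64, count int) ([]byte, error)                // read file data at an explicit offset
	Write(f Fh, offset uint64, data []byte) (int, error)                // write file data at an explicit offset
	Link(dir Fh, name string, target Fh) error                          // manipulate links
	Mkdir(dir Fh, name string, mode uint32) (Fh, error)                 // manipulate directories
}

Note that Read and Write take explicit offsets: with no server-side notion of an open file, there is no file position for the server to remember.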

Distributed File System Issues


Naming
How are files named? Is access location independent? Is the name location independent?
- FTP: location and access dependent.
- NFS: location dependent through client mount points. Largely transparent for ordinary users, but the same remote file system could be mounted differently on different machines. Access independent. See Fig 9-3. Has an automount feature for file systems to be mounted on demand. All clients could be configured to have the same naming structure.
- AFS: location independent. Each client has the same view within a cell; there is a cell at each site. See Fig 13-15.

Migration
Can files be migrated between file server machines? What must clients be aware of?
- FTP: yes, but the end user must be aware of the move.
- NFS: mount points must be changed on the client machines.
- AFS: on a per-volume basis (a volume is a collection of files managed as a single unit).

Directories
Are directories and files handled with the same mechanism or a different one?
- FTP: directory listing is handled as a remote command.
- NFS: Unix-like.
- AFS: Unix-like.
- Amoeba has a separate mechanism for directories and files.

Sharing Semantics
What type of file sharing semantics is supported if two processes access the same file? Possibilities:
- Unix semantics: every operation on a file is instantly visible to all processes.
- session semantics: no changes are visible to other processes until the file is closed.
- immutable files: files cannot be changed; new versions must be created.
How the systems compare (a toy model contrasting the first two follows this list):
- FTP: user-level copies; no support.
- NFS: mostly Unix semantics.
- AFS: session semantics.
- Amoeba uses immutable files.
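
Below is a toy in-memory model, not code from any of these systems, contrasting session semantics with Unix semantics; the server map and the sessionFile type are invented for illustration.

package main

import "fmt"

// server holds the authoritative copy of each file's contents.
var server = map[string][]byte{"notes.txt": []byte("v1")}

// sessionFile models session semantics: open copies the file, writes stay
// local to this session, and only close pushes the new contents back.
type sessionFile struct {
	name string
	data []byte
}

func open(name string) *sessionFile {
	cp := append([]byte(nil), server[name]...) // private copy for this session
	return &sessionFile{name: name, data: cp}
}

func (f *sessionFile) write(b []byte) { f.data = b }

func (f *sessionFile) close() { server[f.name] = f.data } // changes become visible here

func main() {
	a := open("notes.txt")
	b := open("notes.txt")

	a.write([]byte("v2"))
	fmt.Printf("b sees %q before a closes\n", b.data) // still "v1": not yet visible

	a.close()
	c := open("notes.txt")
	fmt.Printf("c sees %q after a closes\n", c.data) // now "v2"

	// Under Unix semantics, write would modify the shared copy immediately,
	// and b would observe "v2" without waiting for the close.
}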

Caching
What, if any, file caching is supported? Possibilities:
- write-through: all changes made on the client are immediately written through to the server.
- write-back: changes made on the client are cached for some amount of time before being written back to the server.
- write-on-close: a type of write-back where changes are written when the file is closed (matches session semantics); see the sketch after this list.
How the systems compare:
- FTP: none; the user maintains their own (whole-file) copy.
- NFS: file attributes (inodes) and file data blocks are cached separately. Cached attributes are validated with the server on file open. Version 3 uses read-ahead and delayed writes from the client cache; caching is time-based at the block level, so new or changed files may not be visible to other clients for up to 30 seconds. This is neither Unix nor session semantics, and the semantics are non-deterministic when multiple processes have the same file open for writing. Version 4: the client must flush modified file contents back to the server when the file is closed at the client. The server can also delegate a file to a client so that the client can handle all requests for the file without checking with the server; however, the server must then maintain state about open delegations and recall a delegation (with a callback) if the file is needed on another machine.
- AFS: file-level caching with callbacks (explain). Session semantics. Concurrent sharing is not possible.
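
A minimal sketch, not taken from NFS or AFS, of a client-side whole-file cache that buffers writes locally and flushes dirty data to the server on close (write-on-close). The RemoteStore interface and all names here are placeholders for whatever RPCs a real protocol would use.

package cache

// RemoteStore stands in for the file server; its two calls are placeholders
// for whatever RPCs the real protocol would use.
type RemoteStore interface {
	Fetch(name string) ([]byte, error)
	Store(name string, data []byte) error
}

// CachedFile buffers a whole file on the client. Writes only touch the local
// copy (write-back); Close pushes dirty contents to the server (write-on-close).
type CachedFile struct {
	store RemoteStore
	name  string
	data  []byte
	dirty bool
}

func Open(store RemoteStore, name string) (*CachedFile, error) {
	data, err := store.Fetch(name) // whole-file fetch into the client cache
	if err != nil {
		return nil, err
	}
	return &CachedFile{store: store, name: name, data: data}, nil
}

// Write updates the cached copy only; nothing reaches the server yet.
func (f *CachedFile) Write(offset int, p []byte) {
	if need := offset + len(p); need > len(f.data) {
		f.data = append(f.data, make([]byte, need-len(f.data))...)
	}
	copy(f.data[offset:], p)
	f.dirty = true
}

// Close flushes dirty data back to the server, giving write-on-close behaviour.
func (f *CachedFile) Close() error {
	if !f.dirty {
		return nil
	}
	return f.store.Store(f.name, f.data)
}

A write-through cache would instead call Store on every Write; a time-based write-back cache (as in NFS version 3) would flush dirty data after a delay rather than waiting for the close.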

Locking
Does the system support locking of files?
- FTP: N/A.
- NFS: has a mechanism, but it is external to NFS in version 3 and internal to the file system in version 4.
- AFS: does support locking.

Replication/Reliability
Is file replication/reliability supported, and how?
- FTP: no.
- NFS: minimal support in version 4.
- AFS: for read-only volumes within a cell, for example binaries and system libraries.

Scalability
Is the system scalable?
- FTP: yes; millions of users.
- NFS: not so much; tens to hundreds of clients.
- AFS: better than NFS, since it keeps traffic away from the file servers; thousands of clients.

Homogeneity
Is hardware/software homogeneity required?
- FTP: no.
- NFS: no.
- AFS: no.

File System Interface


Is the application interface compatible with Unix, or is another interface used?
- FTP: separate interface.
- NFS: the same as Unix.
- AFS: the same as Unix.

Security
What security and protection features are available to control access?
- FTP: account/password authorization.
- NFS: Unix-style authentication carried in RPC. Version 4 uses RPCSEC_GSS, a general security framework that can use proven security mechanisms such as Kerberos.
- AFS: Unix permissions for files, access control lists for directories.
- CODA has a secure RPC implementation.

State/Stateless
Do file system servers maintain state about clients?
- FTP: no.
- NFS: no; in version 4, servers maintain state about delegations and file locking.
- AFS: yes.

AFS Design Principles


What was learned; think about these points for file systems and other large distributed systems.
- Workstations have cycles to burn: make clients do work whenever possible.
- Cache whenever possible.
- Exploit file usage properties, and understand them; one-third of Unix files are temporary.
- Minimize system-wide knowledge and change; do not hardwire locations.
- Trust the fewest possible entities; do not trust workstations.
- Batch where possible to group operations.

Elephant: The File System that Never Forgets


by Santry, et al. (U. British Columbia and HP), in USENIX OSDI '99. The motivation is that disks and storage are cheap while information is valuable. The straightforward idea is to store all (significant) versions of a file without the need for user intervention, so that all user operations are reversible: a simple but powerful goal for the system. A new version of a file is created each time it is written, which has similarities to a log-structured file system. File versions are referenced by time, and versioning extends to directories. Per-file and per-file-group policies govern how file storage is reclaimed.

What Files to Keep?


The basic idea is to keep landmark (distinguished) file versions and discard the others.
- Keep One: only the current version. Good for unimportant or easily recreated files.
- Keep All: the complete history is maintained.
- Keep Landmarks: user-defined landmarks (similar to the check-in idea in RCS) are allowed, plus a heuristic to tag other versions as landmarks.
Not all files should be treated the same; for example, object files and source files have different characteristics. A small sketch of such retention policies follows.
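
A toy sketch, not Elephant's actual implementation, of how these retention policies might be expressed; the Version record and the policy selection below are invented for illustration.

package retain

import "time"

// Version is one stored version of a file.
type Version struct {
	When     time.Time
	Landmark bool // tagged explicitly by the user or by a heuristic
}

// Policy decides which versions of a file survive storage reclamation.
type Policy int

const (
	KeepOne       Policy = iota // only the current version (easily recreated files)
	KeepAll                     // complete history
	KeepLandmarks               // landmark versions plus the current one
)

// Retain returns the versions kept under the given policy.
// Versions are assumed to be ordered oldest to newest.
func Retain(p Policy, versions []Version) []Version {
	if len(versions) == 0 {
		return nil
	}
	switch p {
	case KeepAll:
		return versions
	case KeepLandmarks:
		kept := make([]Version, 0, len(versions))
		for _, v := range versions[:len(versions)-1] {
			if v.Landmark {
				kept = append(kept, v)
			}
		}
		return append(kept, versions[len(versions)-1]) // always keep the current version
	default: // KeepOne
		return versions[len(versions)-1:]
	}
}

A per-file-group policy could then map, say, object files to KeepOne and source files to KeepLandmarks.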

Google File System


From the paper by Ghemawat, et al. (Google), ACM SOSP '03. The design of a file system for a different environment, where the assumptions of a general-purpose file system do not hold; it is interesting to see how new assumptions lead to a different type of system. Differences:
1. component failures are the norm.
2. huge files (not just the occasional one).
3. append rather than overwrite is typical.
4. co-design of the application and the file system API (specialization); for example, consistency can be relaxed.

Architecture
A single master and multiple chunkservers, as shown in Fig 1; each is a commodity Linux server. Files are stored in fixed-size 64 MB chunks, kept as Linux files on the chunkservers. Each chunk has a 64-bit chunk handle, and by default there are three replicas of each chunk. The master maintains the file system metadata. Clients do not cache data (it is typically not reused), but they do cache metadata. The large chunk size helps minimize client interaction with the master (a potential bottleneck), lets the client maintain a persistent TCP connection to a chunkserver, and reduces the amount of metadata held at the master.
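
A minimal sketch of the client-side arithmetic implied by fixed-size chunks: a byte offset is translated into a chunk index, which the client sends to the master to obtain a chunk handle and replica locations before contacting a chunkserver directly. The Master interface and the type names here are illustrative, not GFS's actual API.

package gfs

const chunkSize = 64 << 20 // 64 MB fixed-size chunks

// ChunkHandle is the 64-bit identifier the master assigns to each chunk.
type ChunkHandle uint64

// Master is a stand-in for the metadata server: given a file name and a
// chunk index, it returns the chunk handle and the chunkserver replicas.
type Master interface {
	Locate(file string, chunkIndex int64) (ChunkHandle, []string, error)
}

// chunkIndex converts a byte offset within a file to a chunk index.
func chunkIndex(offset int64) int64 { return offset / chunkSize }

// ReadAt shows the metadata half of a read: one round trip to the master per
// chunk, after which the client fetches the data from a chunkserver directly
// (and can cache the handle and locations to avoid repeat lookups).
func ReadAt(m Master, file string, offset int64) (ChunkHandle, []string, error) {
	return m.Locate(file, chunkIndex(offset))
}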

Shark: Scaling File Servers via Cooperative Caching


by Annapureddy, et al. (NYU), USENIX NSDI '05. Motivated by distributed computing environments where computations run in replicated execution environments. Heavy replication has drawbacks: it costs bandwidth, requires hard state at each replica, and the replicated run-time environment is not the same as the development environment. Shark is designed to support widely distributed applications: it can export a file system, and it achieves scalability through a location-aware cooperative cache, a p2p file system for read sharing. At heart it is a centralized file system like NFS.

Design
Key ideas:
- Once a client retrieves a file, it becomes a replica proxy that can serve the file to other clients.
- Files are stored and retrieved as chunks, and a client can retrieve chunks from multiple locations.
- A token is assigned for the whole file and for each chunk.
- A Rabin fingerprint algorithm is used to preserve data commonality across chunks; the idea is that different versions of a file have many chunks in common. (A simplified content-defined chunking sketch follows this list.)
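
A much-simplified sketch of content-defined chunking: a rolling hash over a sliding window decides where chunk boundaries fall, so unchanged regions of two file versions tend to produce identical chunks even when other parts of the file are edited. A real implementation would use a Rabin fingerprint; the Rabin-Karp-style rolling hash below merely stands in for it, and the window size, mask, and minimum chunk size are arbitrary choices.

package chunk

const (
	window   = 48         // sliding window length in bytes
	prime    = 1000000007 // multiplier for the rolling hash
	mask     = 0x1FFF     // boundary test mask: roughly 8 KB average chunks
	minChunk = 2048       // suppress very small chunks
)

// pow is prime^window, used to remove the byte leaving the window.
var pow = func() uint64 {
	p := uint64(1)
	for i := 0; i < window; i++ {
		p *= prime
	}
	return p
}()

// Split divides data into content-defined chunks: a boundary is declared
// wherever the window hash matches the mask. Because boundaries depend only
// on local content, unchanged regions of two file versions yield identical
// chunks even when other parts of the file are edited.
func Split(data []byte) [][]byte {
	var chunks [][]byte
	var h uint64
	start := 0
	for i := 0; i < len(data); i++ {
		h = h*prime + uint64(data[i]) // byte entering the window
		if i >= window {
			h -= pow * uint64(data[i-window]) // byte leaving the window
		}
		if i-start+1 >= minChunk && h&mask == mask {
			chunks = append(chunks, data[start:i+1])
			start = i + 1
		}
	}
	if start < len(data) {
		chunks = append(chunks, data[start:]) // trailing chunk
	}
	return chunks
}

Chunks could then be named by a collision-resistant hash of their contents, so a client fetching a new version of a file only downloads the chunks it does not already hold.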

File Consistency
Uses leases and whole-file caching, in the style of AFS. The default lease is 5 minutes, with callbacks. The client must refetch the entire file if it has been modified, but it may not have to retrieve all of the chunks, and it can fetch them from client proxies.
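
A small sketch, with invented names and the 5-minute lease hard-coded, of the bookkeeping a client might keep for this scheme: serve reads from the cache while the lease holds, drop the lease on a server callback, and after a modification fetch only the chunks it does not already have.

package shark

import "time"

const leaseDuration = 5 * time.Minute // default lease from the paper

// cachedFile is a client's view of one whole-file cache entry.
type cachedFile struct {
	token      string            // whole-file token from the server
	chunks     map[string][]byte // chunk token -> data already held locally
	leaseUntil time.Time         // while this holds, reads need no server contact
}

// fresh reports whether the cached copy can be used without asking the server.
func (f *cachedFile) fresh(now time.Time) bool {
	return now.Before(f.leaseUntil)
}

// invalidate is what a server callback would trigger on a modification,
// cancelling the lease before it expires.
func (f *cachedFile) invalidate() {
	f.leaseUntil = time.Time{}
}

// missingChunks returns the chunk tokens that still need to be fetched, from
// the server or from peer client proxies, after the file has changed.
func (f *cachedFile) missingChunks(newTokens []string) []string {
	var need []string
	for _, t := range newTokens {
		if _, ok := f.chunks[t]; !ok {
			need = append(need, t)
		}
	}
	return need
}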

Summary
Key ideas:
- centralized administration
- shared reads through cooperative caching
- smart chunking
- parallel chunk download
- a distributed index using tokens
