Lec 11 - Distributed Files - Distributed File System

The document discusses distributed file systems and describes how a file system can be extended over a network. It provides details on client-server architecture and covers Sun NFS as an example of a distributed file system.

Uploaded by

Junaid Khan
Copyright
© All Rights Reserved

DISTRIBUTED FILE SYSTEM
Extending the file system over a network
File System
◦ A file system is used to control how data is stored and retrieved
◦ Without a file system, information placed in a storage area would be one
large body of data with no way to tell where one piece of information stops
and the next begins
◦ By separating the data into individual pieces or files, and giving each piece a
name, the information is easily separated and identified
◦ A file is a logical organization of data
◦ There are many different kinds of file systems
◦ Each one has its own structure and logic, with different properties of speed,
flexibility, security, size and more.
◦ Some file systems have been designed to be used for specific applications

April 2018 Distributed File System 2




File System
◦ File systems can be used on many different kinds of storage devices.
◦ Network file systems provide file access via a network protocol (for example, NFS or SMB)
◦ Some file systems are "virtual", in that the "files" supplied are computed on
request or are merely a mapping into a different file system used as a
backing store.
◦ The file system manages access to both the content of files and
the metadata about those files.
◦ It is responsible for arranging storage space; reliability, efficiency, and
tuning with regard to the physical storage medium are important design
considerations



[Figure: Schematic view of a virtual file system]


Properties of Storage Systems
Columns: Sharing, Persistence, Distributed cache, Consistency, Example
(the tick/cross entries of the table are not recoverable; only the numeric consistency values and the examples survive)

Storage system                Consistency  Example
Main memory                   1            RAM
File system                   1            UNIX file system
Distributed file system                    Sun NFS
Web                                        Web server
Distributed shared memory                  Ivy (DSM, Ch. 18)
Remote objects (RMI/ORB)      1            CORBA
Persistent object store       1            CORBA Persistent Object Service
Peer-to-peer storage system   2            OceanStore (Ch. 10)

Types of consistency:
1: strict one-copy. [symbol lost]: slightly weaker guarantees. 2: considerably weaker guarantees.



File System Modules

Directory module: relates file names to file IDs

File module: relates file IDs to particular files


Access control module: checks permission for operation requested

File access module: reads or writes file data or attributes

Block module: accesses and allocates disk blocks

Device module: disk I/O and buffering
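
The layering above can be illustrated with a toy, in-memory sketch. It is not how any real file system is built: the class `TinyFileSystem`, its method names, the owner-only access rule, and the 4-byte block size are all assumptions chosen to keep the example short. Each dictionary stands in for one module.

```python
class TinyFileSystem:
    """Toy model: one dictionary per module (all names are hypothetical)."""

    def __init__(self):
        self.directory = {}   # directory module: file name -> file ID
        self.files = {}       # file module: file ID -> file record
        self.blocks = {}      # block/device modules: block number -> bytes
        self.next_id = 1
        self.next_block = 0

    def create(self, name, owner):
        file_id = self.next_id
        self.next_id += 1
        self.directory[name] = file_id
        self.files[file_id] = {"owner": owner, "blocks": [], "length": 0}
        return file_id

    def _open(self, name, user):
        record = self.files[self.directory[name]]
        # Access control module: in this sketch only the owner may operate.
        if record["owner"] != user:
            raise PermissionError(f"{user} may not access {name}")
        return record

    def write(self, name, user, data, block_size=4):
        # File access module: replace the file contents.
        record = self._open(name, user)
        for number in record["blocks"]:
            del self.blocks[number]          # free the old blocks
        record["blocks"] = []
        for offset in range(0, len(data), block_size):
            number = self.next_block         # block module: allocate a block
            self.next_block += 1
            self.blocks[number] = data[offset:offset + block_size]
            record["blocks"].append(number)
        record["length"] = len(data)

    def read(self, name, user):
        record = self._open(name, user)
        return b"".join(self.blocks[n] for n in record["blocks"])
```

A caller only ever uses file names; the file ID and block numbers stay hidden behind the lower layers, which is the point of the modular split.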



File Attributes
File length
Creation timestamp
Read timestamp
Write timestamp
Attribute timestamp
Reference count
Owner
File type
Access control list
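
As a rough illustration, several of these attributes can be read on a running system with Python's standard `os.stat` call. The helper name `describe` and the dictionary keys are assumptions made for this sketch. `os.stat` does not expose access control lists, and `st_ctime` means "last metadata change" on UNIX but creation time on Windows, so the mapping to the list above is only approximate.

```python
import os
import stat


def describe(path):
    """Return the attributes that os.stat exposes for path (hypothetical helper)."""
    st = os.stat(path)
    if stat.S_ISDIR(st.st_mode):
        file_type = "directory"
    elif stat.S_ISREG(st.st_mode):
        file_type = "regular"
    else:
        file_type = "other"
    return {
        "length": st.st_size,                 # file length in bytes
        "read_timestamp": st.st_atime,        # last access
        "write_timestamp": st.st_mtime,       # last data modification
        "attribute_timestamp": st.st_ctime,   # last metadata change (UNIX)
        "reference_count": st.st_nlink,       # number of hard links
        "owner": st.st_uid,                   # numeric user ID
        "file_type": file_type,
        "permissions": stat.filemode(st.st_mode),  # e.g. "-rw-r--r--"
    }
```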



DISTRIBUTION OF FILE SYSTEM
Over a network

Distributed File System
◦ A file system that is distributed across a network
◦ Clients typically reach it through a virtual file system (VFS) layer
◦ Follows a client-server architecture
◦ Example: Microsoft's DFS



Distributed File System
Definition
◦ Implement a common file system that can be shared by all autonomous
computers in a distributed system
Goals
◦ Network transparency
◦ High availability
Architectural options
◦ Fully distributed: files distributed to all sites (participants)
◦ Issues: performance, implementation complexity
◦ Client-server model:
◦ File server: dedicated sites that store files and perform storage and retrieval operations
◦ Client: the remaining sites, which use the servers to access files



[Figure: Client-server based DFS]


Architecture
[Figure: file service architecture. The client computer runs application programs and a client module; the server computer runs a directory service and a flat file service; the two communicate over the network.]


[Figure: Flow model of a DFS]


SUN NFS
A distributed, virtual file system


Sun NFS
◦ Network File System (NFS) is a distributed file
system protocol originally developed by Sun
Microsystems in 1984, allowing a user on a
client computer to access files over a network much like
local storage is accessed.
◦ NFS, like many other protocols, builds on the ONC RPC (remote procedure call) system.
◦ The Network File System is an open standard defined in
RFCs, allowing anyone to implement the protocol.



Architecture - NFS
◦ Virtual File System (VFS): File system interface that allows
NFS to support different file systems
◦ Requests for operations on remote files are routed by the VFS to NFS
◦ Requests are sent to the VFS on the remote server using
◦ The remote procedure call (RPC), and
◦ The external data representation (XDR)
◦ The VFS on the remote server initiates the file system operation locally
◦ Vnode (Virtual Node):
◦ There is a network-wide vnode for every object in the file system
(file or directory), the equivalent of a UNIX inode
◦ A vnode has a mount table, allowing any node to be a mount node
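
The routing and encoding steps can be sketched as follows. This is a simplified illustration, not the NFS implementation: the class names, the `remote_prefixes` parameter, and the procedure number are assumptions, and a real NFS request carries a file handle rather than a path. The two XDR helpers do follow the XDR standard: integers are 4 bytes big-endian, and strings are length-prefixed and zero-padded to a multiple of 4 bytes.

```python
import struct


def xdr_uint(value):
    """XDR unsigned integer: 4 bytes, big-endian."""
    return struct.pack(">I", value)


def xdr_string(text):
    """XDR string: 4-byte length, the bytes, zero padding to a multiple of 4."""
    data = text.encode("utf-8")
    padding = (4 - len(data) % 4) % 4
    return xdr_uint(len(data)) + data + b"\x00" * padding


class LocalFileSystem:
    def read(self, path):
        return f"local read of {path}"


class NfsClientStub:
    """Stands in for the NFS client; it only builds the request bytes."""
    READ_PROC = 6  # hypothetical procedure number for this sketch

    def __init__(self):
        self.sent = []

    def read(self, path):
        request = xdr_uint(self.READ_PROC) + xdr_string(path)
        self.sent.append(request)  # a real client would pass this to the RPC layer
        return f"remote read of {path}"


class VirtualFileSystem:
    """Routes each operation to the file system that owns the path."""

    def __init__(self, local, remote, remote_prefixes):
        self.local = local
        self.remote = remote
        self.remote_prefixes = tuple(remote_prefixes)

    def read(self, path):
        is_remote = any(
            path == prefix or path.startswith(prefix + "/")
            for prefix in self.remote_prefixes
        )
        target = self.remote if is_remote else self.local
        return target.read(path)
```

The application calls `VirtualFileSystem.read` in both cases and cannot tell which branch was taken, which is the transparency the VFS layer is meant to provide.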



Sun NFS
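[Figure: NFS architecture. On the client computer, application programs make UNIX system calls into the UNIX kernel, where the virtual file system passes local requests to the UNIX file system (or another file system) and remote requests to the NFS client. The NFS client communicates with the NFS server on the server computer using the NFS protocol; on the server, the NFS server, the virtual file system and the UNIX file system sit in the UNIX kernel.]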




File System on NFS
[Figure: remote mounts on an NFS client]
Server 1: (root) / export / people / big, jon, bob, . . .
Client: (root) / . . ., vmunix, usr; usr / students, x, staff
Server 2: (root) / nfs / users / jim, ann, jane, joe
Remote mounts: people (Server 1) is mounted at students on the client; users (Server 2) is mounted at staff on the client

Note: The file system mounted at /usr/students in the client is actually the sub-tree located at /export/people in Server 1;
the file system mounted at /usr/staff in the client is actually the sub-tree located at /nfs/users in Server 2.
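
A minimal sketch of how a client could translate a local path into a server and a path on that server, using a mount table. The names `MOUNT_TABLE` and `resolve` are assumptions for illustration; the Server 2 path follows the directory names shown in the diagram (nfs, users).

```python
# Client mount point -> (server, exported sub-tree on that server)
MOUNT_TABLE = {
    "/usr/students": ("Server 1", "/export/people"),
    "/usr/staff": ("Server 2", "/nfs/users"),
}


def resolve(path, mount_table=MOUNT_TABLE):
    """Map a client path to (server, path on that server), or ("local", path)."""
    # Try the longest mount point first so nested mounts resolve correctly.
    for mount_point in sorted(mount_table, key=len, reverse=True):
        if path == mount_point or path.startswith(mount_point + "/"):
            server, remote_root = mount_table[mount_point]
            return server, remote_root + path[len(mount_point):]
    return "local", path
```

For example, `resolve("/usr/students/jon")` gives `("Server 1", "/export/people/jon")`, while `resolve("/vmunix")` stays local.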



STORAGE AREA NETWORK
Accessing files over a LAN


SAN
◦ A storage area network (SAN) is a network that provides access to
consolidated, block level data storage.
◦ SANs are primarily used to make storage devices, such as disk arrays, tape
libraries, and optical jukeboxes, accessible to servers so that the devices appear
to the operating system as locally attached devices.
◦ A SAN typically has its own network of storage devices that are generally not
accessible through the local area network (LAN) by other devices.
◦ A SAN does not provide file abstraction, only block-level operations.
◦ However, file systems built on top of SANs do provide file-level access, and are
known as shared-disk file systems.



DESIGN ISSUES
Challenges in DFS

Naming

◦ Name: each object in a file system (file, directory) has a unique name
◦ Name resolution: mapping a name to an object or multiple objects (replication)
◦ Name space: a collection of names, which may or may not share the same resolution mechanism



Naming - Solutions
◦ Concatenate name of host to names of files on that host
◦ Advantage: unique filenames, simple resolution
◦ Disadvantages:
◦ Conflicts with network transparency
◦ Moving file to another host requires changing its name and the applications using it
◦ Mount remote directories onto local directories
◦ Requires that host of remote directory is known
◦ After mounting, files referenced are location-transparent i.e., file name
does not reveal its location
◦ Have a single global directory
◦ All files belong to a single name space
◦ Limitation: having unique system-wide filenames requires a single
computing facility or cooperating facilities



Name Resolution
◦ Contexts
◦ Solve the problem of system-wide unique names, by partitioning a name space into contexts
(geographical, organizational, etc.)
◦ Name resolution is done within that context
◦ Interpretation may lead to another context
◦ File Name = Context + Name local to context
◦ Name server (the DNS approach)
◦ A process that maps file names to objects (files, directories)
◦ Implementation options
◦ Single name server
◦ Simple to implement, but a single point of failure and a performance bottleneck
◦ Several name servers (on different hosts)
◦ Each server is responsible for a domain
◦ Example:
Client requests access to file ‘A/B/C’
The local name server looks up a table (in the kernel)
The local name server points to a remote name server for the ‘/B/C’ mapping
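
The example above can be sketched as a resolver that walks the path one component at a time and follows referrals from one name server to the next. The class `NameServer`, the function `resolve`, the server names and the object ID `"file-42"` are all hypothetical; real name services add caching, replication and failure handling that this sketch leaves out.

```python
class NameServer:
    """Holds the names of one domain (context)."""

    def __init__(self, name):
        self.name = name
        self.table = {}  # component -> NameServer (referral) or object ID (str)

    def bind(self, component, target):
        self.table[component] = target


def resolve(start, path):
    """Resolve path from the start server.

    Returns (object_id, names of the servers consulted, in order).
    """
    server = start
    consulted = [start.name]
    components = [c for c in path.split("/") if c]
    for index, component in enumerate(components):
        if component not in server.table:
            raise KeyError(f"{component!r} not found at {server.name}")
        target = server.table[component]
        if isinstance(target, NameServer):
            # Interpretation leads to another context: continue there.
            server = target
            consulted.append(server.name)
        elif index == len(components) - 1:
            return target, consulted
        else:
            raise NotADirectoryError(f"{component!r} is a file, not a context")
    raise IsADirectoryError(f"{path!r} names a context, not a file")


# Usage: the local server only knows where names under 'A' are handled.
local = NameServer("local")
server_a = NameServer("server-A")
server_b = NameServer("server-B")
local.bind("A", server_a)
server_a.bind("B", server_b)
server_b.bind("C", "file-42")
```

With this setup, `resolve(local, "A/B/C")` consults the local server first and is then referred onward until the last component maps to an object.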



Cache Consistency
◦ How to keep caches of clients and server consistent?
◦ Server initiated
◦ Server informs cache managers when data in client caches is stale
◦ Client cache managers invalidate stale data or retrieve new data
◦ Disadvantage: extensive communication
◦ Client initiated
◦ Cache managers at the clients validate data with the server before returning it to
clients
◦ Disadvantage: extensive communication
◦ Prohibit file caching during concurrent-write sharing
◦ Several clients open a file, at least one of them for writing
◦ Server informs all clients to purge that cached file
◦ Lock files during concurrent-write sharing (at least one client opens the file for writing)
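
The first two approaches can be contrasted in a small in-memory sketch. All class and method names here are assumptions, and version numbers stand in for whatever freshness check a real system uses. `ValidatingClient` is client initiated (it asks the server on every read); `CallbackClient` is server initiated (it trusts its cache until the server invalidates it).

```python
class Server:
    def __init__(self):
        self.files = {}      # name -> (version, data)
        self.callbacks = {}  # name -> set of clients caching that file

    def version(self, name):
        return self.files[name][0]

    def read(self, name, client=None):
        if client is not None:
            # Remember who caches the file so they can be told when it changes.
            self.callbacks.setdefault(name, set()).add(client)
        return self.files[name]

    def write(self, name, data):
        version = self.files.get(name, (0, b""))[0] + 1
        self.files[name] = (version, data)
        # Server initiated: inform every caching client that its copy is stale.
        for client in self.callbacks.pop(name, set()):
            client.invalidate(name)


class ValidatingClient:
    """Client initiated: validate the cached copy with the server on each read."""

    def __init__(self, server):
        self.server = server
        self.cache = {}
        self.validations = 0  # counts the extra round trips

    def read(self, name):
        self.validations += 1
        cached = self.cache.get(name)
        if cached is None or cached[0] != self.server.version(name):
            cached = self.server.read(name)
            self.cache[name] = cached
        return cached[1]


class CallbackClient:
    """Server initiated: use the cache until the server says it is stale."""

    def __init__(self, server):
        self.server = server
        self.cache = {}

    def invalidate(self, name):
        self.cache.pop(name, None)

    def read(self, name):
        if name not in self.cache:
            self.cache[name] = self.server.read(name, client=self)
        return self.cache[name][1]
```

Both variants pay in messages: the validating client on every read, the server on every write to a cached file, which is the communication cost listed as the disadvantage of each.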



Writing Policy
◦ Once a client writes into a file (and the local cache), when should the
modified cache be sent to the server?
◦ Options:
◦ Write-through: every write at a client is immediately transferred to the server
◦ Advantage: reliability
◦ Disadvantage: performance; writes gain nothing from the cache
◦ Delayed writing: delay the transfer to the server
◦ Advantages:
◦ Many writes (including intermediate results) can take place before a single transfer
◦ Some data may be deleted before it is ever transferred
◦ Disadvantage: reliability (delayed writes are lost if the client crashes)
◦ Delayed writing until the file is closed at the client
◦ For short open intervals, the same as delayed writing
◦ For long open intervals, reliability problems
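
The three policies can be compared by counting transfers to the server in a toy model. The classes, the policy strings and the explicit `flush` call are assumptions for this sketch; a real client would flush on a timer, and deleting a file that already exists on the server would also need a message, which is ignored here.

```python
class Server:
    def __init__(self):
        self.data = {}
        self.transfers = 0  # number of client-to-server transfers

    def store(self, name, content):
        self.data[name] = content
        self.transfers += 1


class CachingClient:
    POLICIES = ("write-through", "delayed", "on-close")

    def __init__(self, server, policy):
        if policy not in self.POLICIES:
            raise ValueError(f"unknown policy: {policy}")
        self.server = server
        self.policy = policy
        self.cache = {}
        self.dirty = set()  # names modified locally but not yet transferred

    def write(self, name, content):
        self.cache[name] = content
        if self.policy == "write-through":
            self.server.store(name, content)
        else:
            self.dirty.add(name)

    def delete(self, name):
        # Data that was never transferred needs no transfer at all.
        self.cache.pop(name, None)
        self.dirty.discard(name)

    def flush(self):
        """Periodic flush used by the delayed policy."""
        for name in sorted(self.dirty):
            self.server.store(name, self.cache[name])
        self.dirty.clear()

    def close(self, name):
        if self.policy == "on-close" and name in self.dirty:
            self.server.store(name, self.cache[name])
            self.dirty.discard(name)
```

Three writes to the same file cost three transfers under write-through but only one under the other two policies; the price is that anything still in `dirty` is lost if the client crashes.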



Availability
◦ What is the level of availability of files in a distributed file system?
◦ Use replication to increase availability, i.e. many copies (replicas) of files are
maintained at different sites/servers
◦ Replication issues:
◦ How to keep replicas consistent
◦ How to detect inconsistency among replicas
◦ Unit of replication
◦ File
◦ Group of files
a) Volume: group of all files of a user or group, or all files on a server
◦ Advantage: ease of implementation
◦ Disadvantage: wasteful, since a user may need only a subset replicated
b) Primary pack vs. pack
◦ Primary pack: all files of a user
◦ Pack: subset of the primary pack; each pack can receive a different degree of replication
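
One simple way to see both the availability gain and the consistency problem is to tag each replica's copy with a version number. This is a sketch under assumed names (`Replica`, `write_all`, `read_any`, `stale_replicas`); the slides do not prescribe this scheme, and real systems use more careful protocols.

```python
class Replica:
    def __init__(self, site):
        self.site = site
        self.up = True
        self.files = {}  # name -> (version, data)


def write_all(replicas, name, data):
    """Write to every reachable replica; return how many were updated."""
    live = [r for r in replicas if r.up]
    if not live:
        raise RuntimeError("no replica available")
    version = max(r.files.get(name, (0, None))[0] for r in live) + 1
    for r in live:
        r.files[name] = (version, data)
    return len(live)


def read_any(replicas, name):
    """Read from the first reachable replica that holds the file."""
    for r in replicas:
        if r.up and name in r.files:
            return r.files[name][1]
    raise RuntimeError("no replica available")


def stale_replicas(replicas, name):
    """Detect inconsistency: sites whose version lags the newest one."""
    versions = {r.site: r.files.get(name, (0, None))[0] for r in replicas}
    newest = max(versions.values())
    return sorted(site for site, v in versions.items() if v < newest)
```

Reads keep working while a site is down, but a site that missed a write holds an old version when it returns; `stale_replicas` only detects that, and bringing the replica up to date is a separate step.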



Scalability
◦ Can the design support a growing system?
◦ Example: server-initiated cache invalidation complexity
and load grow with size of system. Possible solutions:
◦ Do not provide cache invalidation service for read-only files
◦ Provide design to allow users to share cached data
◦ Design file servers for scalability: threads, SMPs, clusters



Semantics

◦ Expected semantics: a read will return the data stored by the latest write
◦ Possible options:
◦ All reads and writes go through the server
◦ Disadvantage: communication overhead
◦ Use of a lock mechanism
◦ Disadvantage: file not always available
