UNIT-4: Distributed File Systems & Name Services
• A distributed file system enables programs to store and access remote files exactly as they do local ones, allowing users to access files from any computer on a network.
• In this chapter we discuss a simple architecture for file systems and describe the architecture and implementation of basic distributed file systems.
Two basic distributed file service implementations have been in widespread use:
Sun Network File System (NFS)
Andrew File System (AFS)
Each emulates the UNIX file system interface, with differing degrees of scalability, fault tolerance and deviation from strict UNIX one-copy file update semantics (updates are written to the single copy and are available immediately).
• Sharing of stored information is the main aspect of distributed resource sharing.
• Web servers provide a restricted form of data sharing in which files stored locally at the server are made available to clients throughout the internet.
• But data accessed through a web server is managed and updated in the file system at the server.
• The requirements for sharing within local networks and intranets lead to a need for a different type of service - one that supports the persistent storage of data and programs of all types on behalf of clients and the consistent distribution of up-to-date data.
• A DFS supports the sharing of information in the form of files and hardware resources in the form of persistent storage throughout an intranet.
• A file service enables programs to store and access remote files exactly as they do local ones, allowing users to access files from any computer in the intranet.
• Concentrating persistent storage at a few servers reduces the need for local disk management and makes the maintenance and archiving of persistent data more economical.
• In any organization that operates web servers for external and internal access via an intranet, the web servers often store and access their material from a local distributed file system.
• The figure below provides an overview of the types of storage systems.
Types of consistency (legend used in the figure):
1: strict one-copy consistency; ✓: slightly weaker guarantees; 2: considerably weaker guarantees.
• One-copy consistency: clients cannot observe any discrepancies between cached copies and stored data after an update.
• But where distributed replicas are used, strict consistency is more difficult to achieve.
• AFS and Sun NFS maintain an approximation to strict consistency.
• The consistency between the copies stored at web proxies and client caches and the original server is maintained by explicit user actions.
• Clients are not notified when a page stored at the original server is updated; they must perform explicit checks to keep their local copies up-to-date.
• The CORBA and Persistent Java schemes maintain single copies of persistent objects and require remote invocation to access them, so the only consistency issue is between the persistent copy of an object on disk and the active copy in memory.
Characteristic of File Systems
• File Systems are responsible for
Organization
Storage
Retrieval
Naming
Sharing
Protection of Files
They provide a file abstraction that frees programmers from concern with the details of storage allocation and layout.
• A file contains both data and attributes.
• The data consists of a sequence of data items, accessible by operations to read and write any portion of the sequence.
• The attributes are held as a single record containing information such as:
Length of file
Timestamps
File type
Owner’s identity
Access control list
The shaded attributes are managed by the file system and are not updatable by user programs.
Figure 8.3: File attribute record structure. Fields: file length, creation timestamp, read timestamp, write timestamp, attribute timestamp, reference count, owner, file type, access control list.
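As a minimal sketch, the attribute record of Figure 8.3 might be represented in C as follows. The field types and widths are assumptions for illustration, not an actual on-disk format:

```c
#include <stdint.h>
#include <time.h>

/* Illustrative layout of the attribute record in Figure 8.3. */
struct file_attributes {
    uint64_t file_length;          /* length of the file in bytes */
    time_t   creation_timestamp;   /* when the file was created */
    time_t   read_timestamp;       /* time of the last read */
    time_t   write_timestamp;      /* time of the last write to the data */
    time_t   attribute_timestamp;  /* time of the last change to the attributes */
    uint32_t reference_count;      /* number of directory entries referring to the file */
    uint32_t owner;                /* owner's identity (e.g. a user ID) */
    uint32_t file_type;            /* regular file, directory, etc. */
    uint32_t acl;                  /* reference to the access control list */
};
```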
Sun Network File System (NFS):
• Under Sun NFS:
Virtual File System
Client Integration
Access Control and Authentication
NFS Server Interface
Mount Service
Path Name Translation
Automounter
Server Caching
Client Caching
• The NFS server module resides in the kernel of each computer.
• Requests referring to files in a remote file system are translated by the client module into NFS protocol operations and then passed to the NFS server module at the computer holding the relevant file system.
• The NFS client and server modules communicate using remote procedure calls (RPC).
Virtual File System:
• NFS provides access transparency
• User programs can issue file operations for local or
remote files without distinction.
• Integration is achieved by a virtual file system (VFS) module, which has been added to the UNIX kernel to distinguish between local and remote files and to translate between the UNIX-independent file identifiers used by NFS and the internal file identifiers used in UNIX and other file systems.
• The VFS keeps track of the filesystems that are currently available both locally and remotely, and it passes each request to the appropriate local system module (the UNIX file system, the NFS client module or another file system).
• File identifiers used in NFS are called file
handles. A file handle is opaque to clients and
contains whatever information the server
needs to distinguish an individual file.
File Handle:
Note: 'filesystem' denotes the set of files held on a storage device, while 'file system' refers to the software component that provides access to files.
• The filesystem identifier is a unique number allocated to each filesystem when it is created.
• The i-node generation number is needed because, in conventional UNIX, i-node numbers are reused after a file is removed.
• In the VFS, the generation number is stored with each file and is incremented each time the i-node number is reused.
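A sketch of the information an NFS server packs into a file handle, with assumed field names; clients treat the whole structure as an opaque byte string:

```c
#include <stdint.h>

/* Sketch of the contents of an NFS file handle. */
struct nfs_file_handle {
    uint32_t filesystem_id;      /* unique number allocated at filesystem creation */
    uint32_t inode_number;       /* identifies the file within the filesystem */
    uint32_t generation_number;  /* incremented each time the i-node number is reused */
};
```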
• The virtual file system layer has one VFS structure for each mounted file system and one v-node per open file.
• A VFS structure relates a remote file system to the local directory on which it is mounted.
• A v-node contains an indicator to show whether a file is local or remote. If the file is local, the v-node contains a reference to the index of the local file.
• If it is remote, it contains the file handle of the remote file.
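A hedged sketch of a v-node with simplified, assumed field names (a real kernel v-node carries much more state), reusing the hypothetical nfs_file_handle above:

```c
/* Simplified v-node: the flag selects which member of the union is valid. */
struct vnode {
    int is_remote;                         /* 0 = local file, 1 = remote file */
    union {
        uint32_t local_inode;              /* index (i-node number) of a local file */
        struct nfs_file_handle remote_fh;  /* file handle of a remote file */
    } ref;
};
```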
Client Integration:
• The NFS client module emulates the semantics of the standard UNIX file system primitives and is integrated with the UNIX kernel.
• Because it is integrated with the kernel, user programs can access files via UNIX system calls without recompilation or reloading.
• A single client module serves all of the user-level processes, with a shared cache of recently used blocks.
• The encryption key used to authenticate user IDs passed to the server is retained in the kernel.
• The NFS client module cooperates with the VFS in each client machine, transferring blocks of files to and from the server and caching the blocks in local memory whenever possible.
• It shares the same buffer cache that is used by the local input/output system.
Access Control and Authentication:
• The NFS server is stateless and does not keep files open on behalf of its clients.
• So the server must check the user's identity against the file's access permission attributes on each request.
• Kerberos has been integrated with Sun NFS to provide a solution for user authentication and security.
NFS Server Interface:
• NFS file access operations such as read, write, getattr and setattr are almost identical to the Read, Write, GetAttributes and SetAttributes operations defined in the flat file service model.
• The lookup operation is similar to the one in the directory service model; C-style sketches of these operations are shown below.
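These sketches are simplified assumptions to convey the flavour of the interface; the real protocol is defined in RPC/XDR and each call carries more arguments than shown here:

```c
/* C-style sketches of NFS server operations (illustrative only). */
int nfs_read(struct nfs_file_handle fh, long offset, int count, char *buffer);
int nfs_write(struct nfs_file_handle fh, long offset, int count, const char *data);
int nfs_getattr(struct nfs_file_handle fh, struct file_attributes *attr);
int nfs_setattr(struct nfs_file_handle fh, const struct file_attributes *attr);
/* lookup: translate one pathname component relative to directory dir */
int nfs_lookup(struct nfs_file_handle dir, const char *name,
               struct nfs_file_handle *result, struct file_attributes *attr);
```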
Mount Service:
• The mounting of subtrees of remote filesystems by clients is supported by a separate mount service process.
• On each server there is a file with a well-known name (/etc/exports) containing the names of the local filesystems that are available for remote mounting (a sample is shown below).
• An access list is associated with each filesystem name, indicating which hosts are permitted to mount that filesystem.
• A modified mount command communicates with the mount service process on the remote host using a mount protocol.
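For example, a server's /etc/exports might contain entries like the following; the hostnames are illustrative, and the exact option syntax varies between implementations:

```
/export/people   client1(rw) client2(ro)
/nfs/users       *.dcs.example.ac.uk(rw)
```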
Local and remote file systems accessible on an NFS client
Note: The file system mounted at /usr/students in the client is actually the sub-tree
located at /export/people in Server 1;
the file system mounted at /usr/staff in the client is actually the sub-tree located at
/nfs/users in Server 2.
• Programs running at the client can access files at Server 1 and Server 2 by using pathnames such as /usr/students/jon and /usr/staff/ann.
• A remote filesystem may be hard-mounted or soft-mounted, as illustrated below.
• Hard-mounted means that a process is suspended until the request can be completed, and if the remote host is unavailable for any reason the NFS client module continues to retry the request until it is satisfied.
• Soft-mounted: the NFS client module returns a failure indication to user-level processes after a small number of retries.
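Typical hard and soft mounts might be requested as follows; the option names follow common UNIX mount implementations and the details vary by system:

```
# hard mount: retry indefinitely if the server is unavailable
mount -o hard server1:/export/people /usr/students

# soft mount: report failure to the caller after a few retries
mount -o soft,retrans=3 server2:/nfs/users /usr/staff
```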
Path Name Translation:
• Pathnames are parsed, and their translation is performed, in an iterative manner by the client.
• Each part of a name that refers to a remote-mounted directory is translated to a file handle using a separate lookup request to the remote server.
• The lookup operation returns the corresponding file handle and file attributes; a sketch of this loop follows.
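A minimal sketch of the iterative translation loop, assuming the hypothetical nfs_lookup signature shown earlier:

```c
#include <string.h>

/* Iterative pathname translation: one lookup request per remote component. */
int translate_path(struct nfs_file_handle root, char *pathname,
                   struct nfs_file_handle *result)
{
    struct file_attributes attr;
    struct nfs_file_handle current = root;
    for (char *part = strtok(pathname, "/"); part != NULL;
         part = strtok(NULL, "/")) {
        /* each component in the remote filesystem costs one lookup RPC */
        if (nfs_lookup(current, part, &current, &attr) != 0)
            return -1;   /* component not found */
    }
    *result = current;
    return 0;
}
```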
Automounter:
1. The automounter is used to mount a remote directory dynamically whenever an 'empty' mount point is referenced by a client.
2. It maintains a table of mount points (pathnames) with a reference to one or more NFS servers listed against each.
3. When the NFS client module attempts to resolve a pathname that includes one of these mount points, it passes a lookup() request to the local automounter, which locates the required filesystem in its table and sends a 'probe' request to each server listed.
• The filesystem on the first server to respond is mounted at the client.
• The mounted filesystem is linked to the mount point using a symbolic link, so that subsequent accesses to the link will not result in further requests to the automounter.
• Finally, the automounter unmounts the remote filesystem when it has not been referenced for some time. (A sample mount table is sketched below.)
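A hypothetical automounter table might look like this, mapping each mount point to one or more candidate servers; the format is illustrative, not the exact Sun automounter map syntax:

```
# mount point      candidate servers
/usr/students      server1:/export/people  server3:/export/people
/usr/staff         server2:/nfs/users
```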
Server Caching:
• In the UNIX file system, file pages, directories and file attributes that have been read from disk are retained in the main-memory buffer cache until the buffer space is required for other pages.
• If a process then issues a read or write request, it can often be satisfied without another disk access.
• This caching technique works because read and write requests issued by user-level processes pass through a single cache; the cache is therefore always kept up-to-date.
• In an NFS server, the use of a server cache to hold recently read disk blocks does not raise any consistency problems. But when a server performs write operations, extra measures are needed. The server offers two options, sketched after this list:
• Data in write operations received from clients is stored in the memory cache at the server and written to disk before a reply is sent to the client. This is write-through caching.
• Data in write operations is stored only in the memory cache; it is written to disk when a commit operation is received for the relevant file. The client issues a commit whenever a file that was open for writing is closed.
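The difference between the two options can be sketched as follows; the helper names (cache_store, disk_flush, send_reply) are assumptions standing in for server internals:

```c
/* Hypothetical server internals used by both sketches below. */
void cache_store(struct nfs_file_handle fh, const char *data, int count);
void disk_flush(struct nfs_file_handle fh);
void send_reply(struct nfs_file_handle fh);

/* Option 1: write-through - the data reaches disk before the reply is sent. */
void handle_write_through(struct nfs_file_handle fh, const char *data, int count)
{
    cache_store(fh, data, count);  /* update the server's memory cache */
    disk_flush(fh);                /* force the blocks to disk ... */
    send_reply(fh);                /* ... before acknowledging the client */
}

/* Option 2: delayed write - the data goes to disk only on commit. */
void handle_write_delayed(struct nfs_file_handle fh, const char *data, int count)
{
    cache_store(fh, data, count);  /* update the memory cache only */
    send_reply(fh);                /* acknowledge immediately */
}

void handle_commit(struct nfs_file_handle fh)
{
    disk_flush(fh);                /* issued when a file open for writing is closed */
    send_reply(fh);
}
```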
Client Caching:
• The client module caches the results of read, write, getattr, lookup and readdir operations in order to reduce the number of requests transmitted to servers.
• Clients are responsible for polling the server to check the validity of the cached data that they hold.
Two Timestamps:
• Tc is the time when the cache entry was last validated.
• Tm is the time when the block was last modified at the server.
• A cache entry is valid at time T if T - Tc is less than a freshness interval t, or if the value of Tm recorded at the client matches the current value of Tm at the server. The validity condition is:

(T - Tc < t) ∨ (Tm_client = Tm_server)
• A validity check is performed whenever a cache entry is used. The first half of the validity condition can be evaluated without access to the server; if it is true, the second half need not be evaluated.
• If it is false, the current value of Tm_server is obtained from the server and compared with the local value Tm_client. If they are the same, the cache entry is taken to be valid and the value of Tc for that entry is updated to the current time.
• If they differ, the cached data has been updated at the server and the cache entry is invalidated, resulting in a request to the server for the relevant data. (A sketch of this check appears below.)
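A minimal sketch of the client-side validity check, assuming a hypothetical helper get_server_tm() that fetches the current Tm from the server via a getattr request:

```c
#include <time.h>

#define FRESHNESS_INTERVAL 3   /* t, in seconds (an assumed value) */

struct cache_entry {
    time_t tc;   /* time the entry was last validated */
    time_t tm;   /* Tm recorded when the data was fetched from the server */
    /* ... cached block data would live here ... */
};

/* Hypothetical helper: fetches the current Tm via a getattr request. */
time_t get_server_tm(struct cache_entry *e);

/* Returns 1 if the cache entry may be used, 0 if it must be refetched. */
int entry_is_valid(struct cache_entry *e)
{
    time_t now = time(NULL);
    if (now - e->tc < FRESHNESS_INTERVAL)
        return 1;                      /* first half holds: no server contact */
    if (get_server_tm(e) == e->tm) {   /* second half: one getattr request */
        e->tc = now;                   /* revalidated: reset Tc */
        return 1;
    }
    return 0;                          /* changed at server: invalidate entry */
}
```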