What Is A Distributed File System?: Dfs Has Two Important Goals
File servers: Dedicated to storing files and performing file access operations.
Clients: Used solely for computational purposes. They access files stored on the
servers.
Client machines can be equipped with local disk storage that can be used for
caching remote files, as a swap area or a storage area.
ARCHITECTURE OF DFS
NAME SERVER
It is a process that maps names specified by users to stored objects such as files
and directories. This mapping is also known as name resolution, and it occurs when a
process refers to a file or directory for the first time.
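Name resolution can be sketched as a table lookup. The example below is only an illustration: the class names, the path, the server name "fs1" and the object id are all hypothetical and do not come from any particular DFS.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StoredObject:
    server: str      # hypothetical name of the file server holding the object
    object_id: int   # hypothetical server-local identifier


class NameServer:
    """Maps user-visible names to stored objects (name resolution)."""

    def __init__(self):
        self._table = {}

    def register(self, name, obj):
        self._table[name] = obj

    def resolve(self, name):
        # Called when a process refers to a name for the first time.
        try:
            return self._table[name]
        except KeyError:
            raise FileNotFoundError(name) from None


ns = NameServer()
ns.register("/home/alice/notes.txt", StoredObject("fs1", 4711))
location = ns.resolve("/home/alice/notes.txt")
```

A real name server would resolve a path one component at a time and might span several machines; the flat dictionary only shows the name-to-object mapping itself.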
CACHE MANAGER
It is a process that implements file caching. In file caching, a copy of a file stored at
a remote file server is brought to the client's machine when the client references it.
Subsequent accesses to the same file by that client can be served from the client's
cache instead of the remote file server, which avoids network latency.
Cache managers are present at both clients and servers. On the server side, cache
managers cache files in main memory to reduce disk latency. If multiple
clients cache the same file and try to modify it, the copies become
inconsistent. To avoid this inconsistency, the cache managers at clients and servers
must coordinate with each other during data storage and retrieval operations.
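The interaction between client and server cache managers can be sketched as follows. This is a minimal illustration under stated assumptions: the server remembers which clients cache a file and invalidates their copies on a write. That invalidation policy is only one possible form of coordination, and all class and method names here are hypothetical.

```python
class FileServer:
    def __init__(self):
        self.files = {}       # file name -> contents
        self.fetch_count = 0  # number of remote fetches served
        self._cachers = {}    # file name -> set of clients caching it

    def fetch(self, name, client):
        self.fetch_count += 1
        self._cachers.setdefault(name, set()).add(client)
        return self.files[name]

    def write(self, name, data, writer):
        self.files[name] = data
        # Coordinate with the other cache managers: drop their stale copies.
        for client in self._cachers.get(name, set()) - {writer}:
            client.invalidate(name)
        self._cachers[name] = {writer}


class ClientCacheManager:
    def __init__(self, server):
        self.server = server
        self.cache = {}

    def read(self, name):
        if name not in self.cache:  # miss: go to the remote server
            self.cache[name] = self.server.fetch(name, self)
        return self.cache[name]     # hit: no network access needed

    def write(self, name, data):
        self.cache[name] = data
        self.server.write(name, data, self)

    def invalidate(self, name):
        self.cache.pop(name, None)


server = FileServer()
server.files["a.txt"] = b"v1"
c1, c2 = ClientCacheManager(server), ClientCacheManager(server)
c1.read("a.txt")
c1.read("a.txt")                    # second read is served from c1's cache
fetches_after_two_reads = server.fetch_count
c2.read("a.txt")
c2.write("a.txt", b"v2")            # invalidates c1's cached copy
seen_by_c1 = c1.read("a.txt")       # c1 must fetch the new version
```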
DATA ACCESS ACTIONS IN DFS
Mount information can be maintained at clients, in which case each client must
individually mount every required file system (Sun NFS). Since each client can mount
a file system at any node in its namespace tree, clients need not see an identical
file namespace.
Mount information can instead be maintained at servers, in which case every client
sees an identical file namespace (Sprite file system). If files are moved to different
servers, the mount information needs to be updated only at the servers. With
client-maintained mount information, every client needs to update its mount table.
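A client-side mount table can be sketched as a map from mount points to remote file systems, resolved by longest matching prefix. The sketch below is an assumption-laden illustration, not the data structure of Sun NFS or Sprite; the server name and paths are hypothetical.

```python
class MountTable:
    """Per-client table: mount point -> (server, exported path)."""

    def __init__(self):
        self._mounts = {}

    def mount(self, mount_point, server, remote_path):
        self._mounts[mount_point.rstrip("/")] = (server, remote_path.rstrip("/"))

    def resolve(self, path):
        # Pick the longest mount point that is a prefix of the path.
        best = None
        for mp in self._mounts:
            if path == mp or path.startswith(mp + "/"):
                if best is None or len(mp) > len(best):
                    best = mp
        if best is None:
            raise FileNotFoundError(path)
        server, remote = self._mounts[best]
        return server, remote + path[len(best):]


# Two clients mount the same remote file system at different nodes,
# so they see different namespaces for the same remote file.
alice, bob = MountTable(), MountTable()
alice.mount("/projects", "fs1", "/export/projects")
bob.mount("/work", "fs1", "/export/projects")
alice_view = alice.resolve("/projects/dfs/readme")
bob_view = bob.resolve("/work/dfs/readme")
```

If the exported file system moved to another server, both `alice` and `bob` would have to remount it, which is the update cost of client-maintained mount information described above.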
CACHING
This mechanism is used in DFS to reduce delays in accessing data. In file caching, a
copy of data stored at a remote file server is brought to the client when the client
references it. Clients can cache data in main memory or on their local disk. Servers
cache data in main memory to reduce disk access latency.
Caching introduces the cache consistency problem. Guaranteeing consistency requires a
high level of cooperation between file servers and clients, which is very expensive. An
alternative is to treat cached data as hints, that is, cached data are not expected to be
completely accurate. For example, after the name of a file or directory is mapped to a
physical object, the address of the object can be stored as a hint in the cache. If the
address fails to map to the object, the cached address is deleted from the cache, the
file server consults the name server to obtain the actual location of the file, and the
cache is updated.
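The hint mechanism can be sketched as a cache whose entries are checked on use and discarded when they turn out to be wrong. In this illustration the name server is modeled as a plain dictionary and each server as a set of file names; these structures and all names are hypothetical simplifications.

```python
class HintedLocator:
    def __init__(self, name_server, servers):
        self.name_server = name_server  # authoritative: file name -> server
        self.servers = servers          # server -> set of file names it stores
        self.hints = {}                 # cached locations, possibly stale
        self.name_server_lookups = 0

    def locate(self, name):
        hint = self.hints.get(name)
        if hint is not None:
            if name in self.servers.get(hint, set()):
                return hint             # the hint was correct
            del self.hints[name]        # the hint failed: delete it
        # Fall back to the name server and refresh the cache.
        self.name_server_lookups += 1
        actual = self.name_server[name]
        self.hints[name] = actual
        return actual


name_server = {"report.txt": "fs1"}
servers = {"fs1": {"report.txt"}, "fs2": set()}
locator = HintedLocator(name_server, servers)

first = locator.locate("report.txt")    # no hint yet: asks the name server
second = locator.locate("report.txt")   # served from the hint
lookups_before_move = locator.name_server_lookups

# The file moves to fs2; the cached hint is now stale.
servers["fs1"].remove("report.txt")
servers["fs2"].add("report.txt")
name_server["report.txt"] = "fs2"
after_move = locator.locate("report.txt")
```

No server has to notify the cache when the file moves; the stale hint is detected and repaired lazily on the next use.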
In bulk data transfer, multiple consecutive data blocks are transferred from server
to client. This reduces file access overhead by obtaining multiple blocks
with a single disk seek. The mechanism is justified by the fact that most files
are accessed in their entirety.
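The saving can be illustrated by counting server requests when a whole file is read one block at a time versus several consecutive blocks at a time. The block size, the class and the function below are hypothetical, and the request counter is only a stand-in for per-request costs such as disk seeks.

```python
BLOCK_SIZE = 4  # bytes per block; deliberately tiny for the example


class BlockServer:
    def __init__(self, data):
        self.data = data
        self.requests = 0

    def read_blocks(self, first_block, count):
        """Return `count` consecutive blocks starting at `first_block`."""
        self.requests += 1
        start = first_block * BLOCK_SIZE
        return self.data[start:start + count * BLOCK_SIZE]


def read_file(server, size, blocks_per_request):
    total_blocks = -(-size // BLOCK_SIZE)  # ceiling division
    chunks = []
    for first in range(0, total_blocks, blocks_per_request):
        count = min(blocks_per_request, total_blocks - first)
        chunks.append(server.read_blocks(first, count))
    return b"".join(chunks)


data = bytes(range(26))  # 26 bytes, i.e. 7 blocks of 4 bytes

single = BlockServer(data)
single_result = read_file(single, len(data), blocks_per_request=1)

bulk = BlockServer(data)
bulk_result = read_file(bulk, len(data), blocks_per_request=4)
```

Both reads return the same bytes, but the bulk variant needs 2 requests instead of 7.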
ENCRYPTION
This mechanism is used for security in distributed systems. Two entities
that want to communicate establish a key for their conversation with the help of
an authentication server.
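The key establishment step can be sketched as follows. This is a toy illustration only and is not secure: the wrapping scheme (XOR with an HMAC-SHA256 pad) is an assumption chosen so the example runs with the standard library, and it omits the authentication and replay protection that a real protocol needs. All class, function and principal names are hypothetical.

```python
import hashlib
import hmac
import secrets


def _pad(key, nonce):
    # 32-byte pad derived from a long-term key and a fresh nonce.
    return hmac.new(key, nonce, hashlib.sha256).digest()


def wrap(long_term_key, session_key):
    nonce = secrets.token_bytes(16)
    pad = _pad(long_term_key, nonce)
    return nonce, bytes(a ^ b for a, b in zip(session_key, pad))


def unwrap(long_term_key, wrapped):
    nonce, body = wrapped
    pad = _pad(long_term_key, nonce)
    return bytes(a ^ b for a, b in zip(body, pad))


class AuthenticationServer:
    """Shares a long-term key with each principal and issues session keys."""

    def __init__(self):
        self._long_term = {}

    def enroll(self, principal):
        key = secrets.token_bytes(32)
        self._long_term[principal] = key
        return key  # assumed to reach the principal over a safe channel

    def new_session(self, a, b):
        session_key = secrets.token_bytes(32)
        return (wrap(self._long_term[a], session_key),
                wrap(self._long_term[b], session_key))


auth = AuthenticationServer()
client_key = auth.enroll("client")
server_key = auth.enroll("file-server")

for_client, for_server = auth.new_session("client", "file-server")
client_session_key = unwrap(client_key, for_client)
server_session_key = unwrap(server_key, for_server)
```

After the exchange both parties hold the same session key without ever having shared a key with each other directly; only the authentication server knows both long-term keys.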