Case Study: Google File System


PRACTICAL ASSIGNMENT NO. 12

TITLE: Case Study: Google File System (GFS)

ROLL NO._NAME -> 64_PRERNA JAIN

THEORY:

GOOGLE FILE SYSTEM:

Definition: The Google File System (GFS) is a distributed file system (DFS) for data-centric applications requiring robustness, scalability, and reliability. GFS can be implemented on commodity servers to support large-scale file applications with high performance and high reliability.

Motivation for the GFS Design​: ​The GFS was developed based on the following
assumptions:

a) Systems are prone to failure. Hence there is a need for self-monitoring and self-recovery from failure.

b) The file system stores a modest number of large files, where the file size is greater than
100 MB.

c) There are two types of reads: large streaming reads of 1 MB or more, which come from a contiguous region of a file by the same client, and small random reads of a few KB each.

d) Need to support multiple clients concurrently appending to the same file.

Architecture: Master – Chunk Servers – Client

GFS runs on clusters of computers. A cluster is simply a network of computers, and each cluster might contain hundreds or even thousands of machines. A GFS cluster comprises a single master and multiple chunk servers accessed by multiple clients. In each GFS cluster there are three main entities:

1. Clients
2. Master servers
3. Chunk servers.
Since the files to be stored in GFS are huge, processing and transferring them can consume a lot of bandwidth. To utilize bandwidth efficiently, files are divided into large 64 MB chunks, each identified by a unique 64-bit chunk handle assigned by the master.
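As an illustration of this chunking scheme, here is a minimal Python sketch (hypothetical names and values; GFS itself is implemented in C++ and real chunk handles are opaque identifiers chosen by the master):

CHUNK_SIZE = 64 * 1024 * 1024  # fixed 64 MB chunk size

def chunk_index(byte_offset):
    """Translate a byte offset within a file into the index of its chunk."""
    return byte_offset // CHUNK_SIZE

# Example: an application reads at offset 200 MB; that byte lives in the
# fourth chunk of the file (index 3).
print(chunk_index(200 * 1024 * 1024))   # -> 3

# The master assigns every chunk an immutable, globally unique 64-bit
# chunk handle, represented here simply as an integer.
example_chunk_handle = 0x1A2B3C4D5E6F7081   # hypothetical handle value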
The design supports the usual POSIX operations: open, close, read, and write. In addition, it provides two more operations:
1. Record append: an atomic append operation.
2. Snapshot: creates a copy of a file or directory tree instantaneously.

Client: Clients can be other computers or computer applications that make file requests. Requests can range from retrieving and manipulating existing files to creating new files on the system. Clients can be thought of as customers of the GFS.
➢ Library code linked into each application.
➢ Communicates with the GFS master for metadata operations [control plane].
➢ Communicates with chunk servers for read/write operations [data plane].

Master Server: is the coordinator for the cluster. Its tasks include:
1. Maintaining an operation log that keeps track of the activities of the cluster. The operation log helps keep service interruptions to a minimum: if the master server crashes, a replacement server that has monitored the operation log can take its place.
2. Keeping track of metadata, which is the information that describes chunks. The metadata tells the master server to which files the chunks belong and where they fit within the overall file.
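To make the role of this metadata concrete, here is a rough sketch of the kind of mappings the master keeps in memory (hypothetical Python structures for illustration only):

# File namespace: file path -> ordered list of 64-bit chunk handles.
file_to_chunks = {
    "/logs/web/access.log": [0x11, 0x12, 0x13],   # three 64 MB chunks
}

# Chunk locations: chunk handle -> chunk servers currently holding a replica.
# (In GFS these locations are not persisted; the master learns them by
# polling chunk servers at startup and through heartbeats.)
chunk_locations = {
    0x11: ["chunkserver-a", "chunkserver-b", "chunkserver-c"],
    0x12: ["chunkserver-b", "chunkserver-c", "chunkserver-d"],
    0x13: ["chunkserver-a", "chunkserver-c", "chunkserver-d"],
}

def locate(path, index):
    """Answer a client lookup: which replicas hold chunk `index` of `path`?"""
    handle = file_to_chunks[path][index]
    return handle, chunk_locations[handle]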

Chunk Servers: are the workhorses of the GFS.
➢ They store the 64 MB file chunks. The chunk servers don't send chunks to the master server.
➢ Instead, they send requested chunks directly to the client. The GFS copies every chunk multiple times and stores it on different chunk servers.
➢ Each copy is called a replica. By default, the GFS makes three replicas per chunk, but users can change the setting and make more or fewer replicas if desired.

Chunk Size: The GFS uses a large chunk size of 64 MB. This has the following advantages:
a. Reduces clients' need to interact with the master, because reads and writes on the same chunk require only one initial request to the master for chunk location information.
b. Reduces network overhead by keeping a persistent TCP connection to the chunk server over an extended period of time.
c. Reduces the size of the metadata stored on the master. This allows the metadata to be kept in the master's memory.
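A quick back-of-the-envelope calculation (assuming roughly 64 bytes of metadata per chunk, the figure mentioned in the Single Master paragraph below) shows why the metadata comfortably fits in the master's memory:

CHUNK_SIZE = 64 * 1024 * 1024            # 64 MB per chunk
METADATA_PER_CHUNK = 64                  # ~64 bytes of metadata per chunk

stored_data = 1024 ** 5                  # 1 PB of file data
num_chunks = stored_data // CHUNK_SIZE   # number of chunks needed
metadata_bytes = num_chunks * METADATA_PER_CHUNK

print(num_chunks)                        # 16777216 chunks
print(metadata_bytes / 1024 ** 3)        # ~1.0 GB of metadata for 1 PB of data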

No caching: File data is not cached by the client or the chunk server. Large streaming reads gain little from caching, since most cached data would be overwritten before it is reused.

Single Master: Simplifies the design and allows simple centralized management. The master stores metadata and coordinates access. All metadata is kept in the master's memory, which makes operations fast. The master maintains roughly 64 bytes of metadata per chunk, so its memory is not a serious bottleneck. To minimize master involvement, a lease mechanism is used. A lease is used to maintain a consistent mutation (append or write) order across replicas.

Garbage collection: The system has a unique approach to this. Once a file is deleted, its resources are not reclaimed immediately; instead the file is renamed into a hidden namespace. Such hidden files are removed during the regular scan if they have existed for more than three days. The advantages offered by this approach are:
1) it is simple,
2) deletion of files can take place during the master's idle periods, and
3) it provides safety against accidental deletion.
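A minimal sketch of this lazy reclamation scheme (hypothetical names and structures; in GFS the work happens during the master's regular namespace scan):

import time

GRACE_PERIOD = 3 * 24 * 3600   # hidden files older than 3 days are reclaimed

namespace = {"/logs/old.log": "file-metadata"}
hidden = {}                    # hidden name -> (deletion time, metadata)

def delete_file(path, now=None):
    """'Delete' a file by renaming it into a hidden namespace."""
    meta = namespace.pop(path)
    hidden[".deleted" + path] = ((now or time.time()), meta)

def undelete(path):
    """Accidental deletions can be undone by simply renaming the file back."""
    _, meta = hidden.pop(".deleted" + path)
    namespace[path] = meta

def regular_scan(now=None):
    """During the master's periodic scan, reclaim files hidden for > 3 days."""
    now = now or time.time()
    for name, (deleted_at, _) in list(hidden.items()):
        if now - deleted_at > GRACE_PERIOD:
            del hidden[name]   # only here are the resources actually reclaimed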

Working of GFS:
File Access Method:
File Read:
A simple file read is performed as follows:
a) The client translates the file name and byte offset specified by the application into a chunk index within the file, using the fixed chunk size.
b) It sends the master a request containing the file name and chunk index.
c) The master replies with the corresponding chunk handle and the locations of the replicas. The client caches this information, using the file name and chunk index as the key.
d) The client then sends a request to one of the replicas, most likely the closest one. The request specifies the chunk handle and a byte range within that chunk.
e) Further reads of the same chunk require no more client-master interaction until the cached information expires or the file is reopened.
Fig. 1: File Read System
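The read path can be condensed into a short client-side sketch (a simplified illustration with stubbed-out RPCs; the real GFS client library is C++ code linked into the application):

CHUNK_SIZE = 64 * 1024 * 1024
location_cache = {}   # (file_name, chunk_index) -> (chunk_handle, replica list)

# Stub RPCs standing in for real calls to the master and to a chunk server.
def ask_master(file_name, chunk_index):
    return 0x11, ["chunkserver-a", "chunkserver-b", "chunkserver-c"]

def read_from_chunkserver(server, chunk_handle, offset_in_chunk, length):
    return b"\x00" * length   # pretend data

def gfs_read(file_name, byte_offset, length):
    # a) translate (file name, byte offset) into a chunk index
    index = byte_offset // CHUNK_SIZE
    # b), c) contact the master only if the location is not already cached
    key = (file_name, index)
    if key not in location_cache:
        location_cache[key] = ask_master(file_name, index)
    handle, replicas = location_cache[key]
    # d) request the byte range directly from a replica (ideally the closest)
    closest = replicas[0]   # stand-in for "pick the closest replica"
    # e) later reads of the same chunk reuse the cached entry above
    return read_from_chunkserver(closest, handle, byte_offset % CHUNK_SIZE, length)

data = gfs_read("/logs/web/access.log", 200 * 1024 * 1024, 4096)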
File Write​:
The control flow of a write is given below as numbered steps:
1) The client translates the file name and byte offset specified by the application into a chunk index within the file, using the fixed chunk size. It sends the master a request containing the file name and chunk index.
2) The master replies with the corresponding chunk handle and the locations of the replicas.
3) The client pushes the data to all the replicas. The data is stored in an internal buffer of each chunk server.
4) The client sends a write request to the primary. The primary assigns serial numbers to all the write requests it receives and applies the writes to the data it stores in serial-number order.
5) The primary forwards the write request to all secondary replicas.
6) The secondaries all reply to the primary on completion of the write.
7) The primary replies to the client.
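Steps 4–7 can be sketched as follows (hypothetical classes; the real GFS also pipelines the data push between chunk servers, which is omitted here):

class SecondaryReplica:
    def __init__(self):
        self.applied = []                       # writes applied in serial order

    def apply(self, serial, offset, data):
        self.applied.append((serial, offset, data))
        return True                             # acknowledge completion

class PrimaryReplica:
    """Holds the lease for the chunk and imposes a single mutation order."""

    def __init__(self, secondaries):
        self.secondaries = secondaries
        self.applied = []
        self.next_serial = 0

    def write(self, offset, data):
        # 4) assign a serial number and apply the write locally in that order
        serial = self.next_serial
        self.next_serial += 1
        self.applied.append((serial, offset, data))
        # 5) forward the ordered write request to every secondary replica
        # 6) collect the secondaries' completion replies
        acks = [s.apply(serial, offset, data) for s in self.secondaries]
        # 7) reply to the client (success only if every replica applied it)
        return all(acks)

primary = PrimaryReplica([SecondaryReplica(), SecondaryReplica()])
ok = primary.write(offset=0, data=b"hello")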

Relaxed consistency model:


1) File namespace mutations are always atomic.
2) A file region is consistent if all clients see the same data, no matter which replicas they read from.
3) A file region is defined if it is consistent and clients see the mutation's writes in their entirety.
4) The operation log ensures that the metadata stored by the master is always consistent and defined.
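Because retried record appends can leave duplicates and padding inside a file region, applications built on GFS typically cope with this relaxed model by writing self-describing records. A sketch of that application-level convention (hypothetical record layout):

import uuid, zlib

def make_record(payload: bytes) -> bytes:
    """Prefix each record with a unique id and a checksum so that readers
    can validate it and discard duplicates left by retried appends."""
    record_id = uuid.uuid4().bytes                       # 16-byte unique id
    checksum = zlib.crc32(payload).to_bytes(4, "big")    # 4-byte checksum
    return record_id + checksum + payload

def read_records(raw_records):
    """Yield valid, de-duplicated payloads; skip padding and duplicates."""
    seen = set()
    for rec in raw_records:
        record_id, checksum, payload = rec[:16], rec[16:20], rec[20:]
        if zlib.crc32(payload).to_bytes(4, "big") != checksum:
            continue                                     # padding / garbage
        if record_id in seen:
            continue                                     # duplicate append
        seen.add(record_id)
        yield payload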

Advantages and disadvantages of large-sized chunks in the Google File System:

Advantages:
1. It reduces the clients' need to interact with the master, because reads and writes on the same chunk require only one initial request to the master for chunk location information.
2. Since a client is more likely to perform many operations on a given large chunk, it can reduce network overhead by keeping a persistent TCP connection to the chunk server over an extended period of time.
3. It reduces the size of the metadata stored on the master. This allows the metadata to be kept in memory, which in turn brings other advantages.

Disadvantages:
1. Internal fragmentation, although lazy space allocation avoids wasting space because of it.
2. Even with lazy space allocation, a small file consists of a small number of chunks, perhaps just one.
3. The chunk servers storing those chunks may become hot spots if many clients are accessing the same file.
In practice, hot spots have not been a major issue, because the applications mostly read large multi-chunk files sequentially. To mitigate hot spots, a higher replication factor can be used, and clients can be allowed to read data from other clients.

Challenges as applications evolved:

1) The size of storage increased into the range of petabytes. The amount of metadata maintained by the master grew, and scanning through such large amounts became an issue. The single master started becoming a bottleneck when thousands of client requests arrived simultaneously.
2) The 64 MB standard chunk size design choice created problems when the application mix evolved. The system had to deal with applications generating a large number of small files, e.g. Gmail.
3) The original design choices sacrificed latency. Building many latency-sensitive, user-facing applications such as Gmail and YouTube on top of a file system intended for batch-oriented applications was a major challenge.

In conclusion, the paper introduces some new approaches in distributed file systems, such as spreading a file's data across storage servers, a single master, and append-oriented writes. However, this is an industry paper where the system was designed and built according to the authors' own needs. Achieving a stable and neat design was not the authors' goal; rather, they were willing to bend conventional design rules where necessary to get the desired performance out of the system.
Comparison to Other Systems:

• Provides a location-independent namespace, which enables data to be moved transparently for load balancing and fault tolerance (as in AFS).
• Spreads data across storage servers, unlike AFS.
• Unlike RAID, uses simple file replication.
• Does not provide caching below the file system.
• Uses a single master, rather than a distributed one.
• Provides a POSIX-like interface, but not full POSIX support.
• HDFS (Hadoop) is an open source implementation of Google File System written in Java.
It follows the same overall design, but differs in supported features and implementation
details:
• Does not support random writes.
• Does not support appending to existing files.
• Does not support multiple concurrent writers.

Comparison:
Comparing GFS with other distributed file systems such as Sun Network File System (NFS) and Andrew File System (AFS):
GFS | NFS | AFS
Cluster-based architecture | Client-server based architecture | Cluster-based architecture
No caching | Client and server caching | Client caching
Not similar to UNIX | Similar to UNIX | Similar to UNIX
End users do not interact directly | End users interact | End users interact
File data is stored across different chunk servers, so reads come from different chunk servers | Reads come from the same server | Reads come from the same file server
Server replication | No replication | Server replication
Location-independent namespace | Not a location-independent namespace | Location-independent namespace
Lease-based locking | Lease-based locking | Lease-based locking
