Google File System: Abstract
Abstract:
The great success of Google Inc. is attributed not only to its efficient search
algorithm but also to the underlying commodity hardware and, in turn, the file
system that runs on it. As the number of applications run by Google increased
massively, Google’s goal became to build a vast storage network out of
inexpensive commodity hardware. To that end, Google created its own file system,
the Google File System. It was developed by Google engineers and made ready for
production in record time, within a span of one year, in 2003, which accelerated
Google’s growth in the market thereafter. The Google File System is described as
the largest file system in operation. Formally, the Google File System (GFS) is a
scalable distributed file system for large, distributed, data-intensive
applications.
The design of GFS stresses three points: component failures are the norm rather
than the exception; files are large, on the order of megabytes to terabytes; and
files are typically mutated by appending data rather than overwriting it. The
file system is organized hierarchically in directories, and files are identified
by pathnames. The architecture comprises a single master, multiple chunk
servers, and multiple clients. Files are divided into chunks, and the chunk size
is a key design parameter. GFS also uses leases and mutation order to achieve
consistency and atomicity. As for fault tolerance, GFS is highly available:
replicas exist for both the chunk servers and the master.
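
The division of files into chunks, and the master's role in locating them, can
be illustrated with a minimal sketch. It is not Google's implementation. The
64 MB chunk size is taken from the published GFS design and is not stated in
this abstract; the names ToyMaster, chunk_index, chunk_span, the server labels
and the file path are all hypothetical.

```python
# Assumption: 64 MB chunks, as in the published GFS design (not stated above).
CHUNK_SIZE = 64 * 1024 * 1024


def chunk_index(byte_offset: int, chunk_size: int = CHUNK_SIZE) -> int:
    """Translate a byte offset within a file into a chunk index."""
    if byte_offset < 0:
        raise ValueError("byte_offset must be non-negative")
    return byte_offset // chunk_size


def chunk_span(offset: int, length: int, chunk_size: int = CHUNK_SIZE) -> range:
    """Return the indices of all chunks touched by reading `length` bytes at `offset`."""
    if length <= 0:
        return range(0)
    first = chunk_index(offset, chunk_size)
    last = chunk_index(offset + length - 1, chunk_size)
    return range(first, last + 1)


class ToyMaster:
    """Hypothetical stand-in for the single master: it holds only metadata,
    mapping (pathname, chunk index) to the chunk servers that store replicas."""

    def __init__(self) -> None:
        self._locations: dict[tuple[str, int], list[str]] = {}

    def register(self, path: str, index: int, servers: list[str]) -> None:
        # Record which chunk servers hold a replica of this chunk.
        self._locations[(path, index)] = list(servers)

    def lookup(self, path: str, byte_offset: int) -> tuple[int, list[str]]:
        # A client asks the master where a chunk lives, then would contact
        # one of the returned chunk servers directly for the data itself.
        index = chunk_index(byte_offset)
        return index, self._locations[(path, index)]


if __name__ == "__main__":
    master = ToyMaster()
    master.register("/logs/web.log", 1, ["cs-a", "cs-b", "cs-c"])
    print(master.lookup("/logs/web.log", CHUNK_SIZE + 5))
```

In this sketch the master answers only metadata queries, so file data never
passes through it; that separation is the reason a single master can serve many
clients.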