GFS Summary
1. Introduction
The Google File System (GFS) was created to meet Google’s storage needs, which conventional
file systems served poorly. The key observations that shaped its design include:
• Component Failures as the Norm: Hardware failures are frequent and must be managed
transparently.
• Large File Sizes: Files are huge by traditional standards; multi-gigabyte files are common.
• Workload Characteristics: Workloads consist of large streaming reads, frequent appends,
and minimal random writes.
• High Throughput: The system prioritizes throughput over low latency.
2. Design Overview
GFS follows a single-master architecture with three kinds of components:
• Master Server: Maintains metadata and manages file system operations.
• Chunkservers: Store actual file data in fixed-size chunks (typically 64 MB) and replicate
them.
• Clients: Interact with the master for metadata and communicate directly with chunkservers
for data operations.
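Because chunks have a fixed size, a byte offset in a file maps to a chunk by simple division. The following is a minimal sketch of that arithmetic, assuming the 64 MB chunk size mentioned above; the function names `locate` and `chunks_spanned` are hypothetical and do not come from GFS itself.

```python
CHUNK_SIZE = 64 * 1024 * 1024  # 64 MB, the chunk size stated in this summary


def locate(offset):
    """Return (chunk index, offset inside that chunk) for a file byte offset."""
    if offset < 0:
        raise ValueError("offset must be non-negative")
    return divmod(offset, CHUNK_SIZE)


def chunks_spanned(offset, length):
    """Return the range of chunk indices touched by reading `length` bytes."""
    if length <= 0:
        return range(0)
    first, _ = locate(offset)
    last, _ = locate(offset + length - 1)
    return range(first, last + 1)
```

For example, `locate(CHUNK_SIZE + 5)` yields chunk index 1 and in-chunk offset 5, and a 2-byte read starting at the last byte of chunk 0 spans chunks 0 and 1.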
Key design decisions include:
• File Mutability: Files are mostly appended, not modified in place.
• Relaxed Consistency Model: The system favors availability and simplicity over strict
consistency, and applications are written to tolerate the weaker guarantees.
• Automatic Recovery Mechanisms: Self-healing replication and rebalancing of chunks
across chunkservers.
3. Architecture
3.1 Master Server
• Stores namespace, file-to-chunk mapping, and chunk metadata.
• Keeps all metadata in memory for fast access.
• Records metadata changes in a persistent operation log and periodically checkpoints its state.
• Assigns and reassigns chunks to chunkservers dynamically.
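The combination of in-memory metadata, a change log, and checkpoints can be illustrated with a small sketch. It is a toy model under stated assumptions, not the GFS implementation: the class names, the JSON encoding, the in-memory list standing in for a persistent log, and the simple counter used for chunk handles are all hypothetical.

```python
import json
from dataclasses import dataclass, field


@dataclass
class MasterState:
    files: dict = field(default_factory=dict)  # path -> list of chunk handles
    next_handle: int = 1                       # toy counter, not a real 64-bit handle


class Master:
    def __init__(self):
        self.state = MasterState()
        self.log = []  # stand-in for a persistent, replicated operation log

    def _apply(self, op):
        # Apply one logged operation to the in-memory state.
        if op["type"] == "create":
            self.state.files[op["path"]] = []
        elif op["type"] == "add_chunk":
            self.state.files[op["path"]].append(op["handle"])
            self.state.next_handle = max(self.state.next_handle, op["handle"] + 1)

    def _commit(self, op):
        self.log.append(json.dumps(op))  # record the change first
        self._apply(op)                  # then update in-memory metadata

    def create(self, path):
        self._commit({"type": "create", "path": path})

    def add_chunk(self, path):
        handle = self.state.next_handle
        self._commit({"type": "add_chunk", "path": path, "handle": handle})
        return handle

    def checkpoint(self):
        # Snapshot the full state; older log entries are no longer needed.
        snapshot = json.dumps(
            {"files": self.state.files, "next_handle": self.state.next_handle}
        )
        self.log.clear()
        return snapshot

    @classmethod
    def recover(cls, snapshot, log):
        # Rebuild state from the latest checkpoint plus the log written after it.
        master = cls()
        data = json.loads(snapshot)
        master.state = MasterState(data["files"], data["next_handle"])
        master.log = list(log)
        for line in log:
            master._apply(json.loads(line))
        return master
```

Recovery replays only the log entries written after the latest checkpoint, which is why checkpointing keeps restart time short.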
3.2 Chunkservers
• Store file chunks, each identified by a unique 64-bit chunk handle.
• Chunks are replicated (default: 3 replicas) for fault tolerance.
• Periodically exchange heartbeat messages with the master to report the chunks they hold.
3.3 Clients
• Query the master for chunk locations and cache this information.
• Interact directly with chunkservers for reading/writing data.
• Minimize interaction with the master to reduce bottlenecks.
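Client-side caching of chunk locations might look like the sketch below. It assumes a simple time-based expiry; the class name `LocationCache`, the `lookup` callback standing in for a request to the master, and the 60-second default are illustrative assumptions rather than GFS parameters.

```python
import time


class LocationCache:
    """Caches chunk locations so repeated reads skip the master."""

    def __init__(self, lookup, ttl_seconds=60.0, clock=time.monotonic):
        self._lookup = lookup    # hypothetical call to the master
        self._ttl = ttl_seconds  # assumed expiry period
        self._clock = clock      # injectable clock, useful for testing
        self._entries = {}       # (path, chunk index) -> (expiry time, locations)

    def get(self, path, chunk_index):
        key = (path, chunk_index)
        now = self._clock()
        hit = self._entries.get(key)
        if hit is not None and hit[0] > now:
            return hit[1]        # still fresh: no master round trip
        locations = self._lookup(path, chunk_index)
        self._entries[key] = (now + self._ttl, locations)
        return locations
```

Only a cache miss or an expired entry reaches the master, so a client streaming through one chunk contacts the master once for that chunk.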
4. Data Consistency Model
4.1 Consistency Guarantees
GFS provides relaxed consistency, meaning:
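• Record appends are atomic: each record is written at least once as a contiguous unit, whereas
concurrent writes to the same region can leave it consistent across replicas yet containing
mixed fragments from several writers.
• Replicas converge rather than stay in lockstep: mutations are applied to every replica in the
same order, and stale replicas are detected by chunk version numbers.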
5. System Interactions
5.1 File Reads
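1. Client converts the file name and byte offset into a chunk index and sends both to the master.
2. Master returns the chunk handle and the locations of its replicas.
3. Client reads the requested byte range directly from one of the replica chunkservers.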
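The read path above can be sketched end to end with in-memory stand-ins. Everything here is hypothetical scaffolding: `FakeMaster`, `FakeChunkserver`, and `read_file` are illustrative names, the client always picks the first replica, and `chunk_size` is a parameter only so the example can run with tiny chunks.

```python
class FakeMaster:
    def __init__(self, table):
        self.table = table  # (path, chunk index) -> (handle, [server names])

    def lookup(self, path, chunk_index):
        return self.table[(path, chunk_index)]


class FakeChunkserver:
    def __init__(self, chunks):
        self.chunks = chunks  # chunk handle -> bytes

    def read(self, handle, start, length):
        return self.chunks[handle][start:start + length]


def read_file(master, servers, path, offset, length, chunk_size=64 * 1024 * 1024):
    """Read `length` bytes at `offset`, one chunk at a time."""
    out = bytearray()
    while length > 0:
        index, start = divmod(offset, chunk_size)       # step 1: find the chunk
        handle, replicas = master.lookup(path, index)   # step 2: ask the master
        wanted = min(length, chunk_size - start)
        data = servers[replicas[0]].read(handle, start, wanted)  # step 3: read
        if not data:
            break  # reached the end of the stored data
        out += data
        offset += len(data)
        length -= len(data)
    return bytes(out)
```

Note that the master is only asked where data lives; the bytes themselves travel between client and chunkserver.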
6. Fault Tolerance
6.1 Chunk Replication
• Chunks are replicated across multiple chunkservers.
• The master re-replicates any chunk whose number of live replicas falls below its target, for
example after a chunkserver fails.
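How a master might find chunks that need repair can be sketched as follows. This is an assumption-laden illustration: the function name `repair_queue`, the dictionary layout, and the ordering rule (chunks missing the most replicas first) are choices made for this example.

```python
def repair_queue(chunk_locations, live_servers, target=3):
    """Return [(handle, missing replicas)] for under-replicated chunks.

    chunk_locations: chunk handle -> list of server names holding a replica.
    live_servers: set of server names currently considered alive.
    target: desired replica count (3 is the default stated in this summary).
    """
    queue = []
    for handle, holders in chunk_locations.items():
        live = sum(1 for server in holders if server in live_servers)
        if live < target:
            queue.append((handle, target - live))
    # Most endangered chunks first; ties broken by handle for stable output.
    queue.sort(key=lambda item: (-item[1], item[0]))
    return queue
```

Sorting by the number of missing replicas means a chunk with a single surviving copy is repaired before one that has lost only one replica.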
7. Performance Optimizations
7.1 Caching
• Clients cache metadata to reduce master load.
7.2 Rebalancing
• The master periodically moves chunk replicas onto underutilized chunkservers to even out
disk usage and load.
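One way to picture rebalancing is a function that proposes a single replica move from the fullest server to a server below average utilization. The heuristic, the name `propose_move`, and the data layout are assumptions for illustration; the actual GFS placement policy weighs more factors than disk usage.

```python
def propose_move(utilization, chunks_on):
    """Suggest one (handle, source, destination) move, or None.

    utilization: server name -> fraction of disk in use (0.0 to 1.0).
    chunks_on: server name -> set of chunk handles stored on that server.
    """
    if not utilization:
        return None
    average = sum(utilization.values()) / len(utilization)
    # Take a replica from the most heavily used server.
    source = max(utilization, key=lambda s: (utilization[s], s))
    # Candidate destinations: servers below average, emptiest first.
    below = sorted(
        (s for s in utilization if utilization[s] < average),
        key=lambda s: (utilization[s], s),
    )
    for destination in below:
        # Never place two replicas of the same chunk on one server.
        movable = sorted(chunks_on[source] - chunks_on[destination])
        if movable:
            return movable[0], source, destination
    return None
```

When every server sits at the same utilization, no server is below average and the function returns `None`, so a balanced cluster is left alone.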
8. Real-World Deployment
• GFS powers Google’s large-scale applications, including indexing and data processing tasks.
• Handles petabytes of data across thousands of machines.
9. Conclusion
GFS is a highly scalable and fault-tolerant file system tailored to Google’s needs. Its design
principles, including relaxed consistency, replication, and self-healing mechanisms, make it
well-suited for large-scale distributed data processing.