Chapter 2 HDFS and ZooKeeper
Foreword
This chapter describes HDFS, a distributed storage system for big data, and
ZooKeeper, a distributed service framework that resolves data management
problems frequently encountered in distributed services. It lays a solid
foundation for learning the components covered later.
1 Huawei Confidential
Objectives
Contents
2. HDFS-related Concepts
3. HDFS Architecture
4. Key Features
6. ZooKeeper Overview
7. ZooKeeper Architecture
Dictionary and File System
HDFS Overview
Hadoop Distributed File System (HDFS) is a distributed file system designed to run on
commodity hardware.
HDFS has a high fault tolerance capability and is deployed on cost-effective hardware.
HDFS provides high-throughput access to application data and applies to applications
with large data sets.
HDFS relaxes some Portable Operating System Interface (POSIX) requirements to
enable streaming access to file system data.
HDFS was originally built as the foundation for the Apache Nutch Web search engine
project.
HDFS is a part of the Apache Hadoop Core project.
HDFS Application Scenario Example
Computer Cluster Structure
The distributed file system stores files on multiple
computer nodes. Thousands of computer nodes form
a computer cluster.
Today, the computer clusters used by distributed file systems are built
from commodity hardware, which greatly reduces hardware cost.
Basic System Architecture
[Figure: HDFS Architecture. Labels: Client, Metadata ops, Block ops, Replication, Blocks, Rack 1, Rack 2.]
Block
The default size of an HDFS block is 128 MB. A file is divided into multiple
blocks, which are used as the storage unit.
The block size is much larger than that of a common file system, minimizing the
addressing overhead.
The abstract block concept brings the following benefits:
Supporting large-scale file storage
Simplifying system design
Facilitating data backup
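As a rough illustration of how a file is divided into blocks, the sketch below splits a file size into block lengths using the 128 MB default mentioned above. The function name `split_into_blocks` is hypothetical and is not part of any HDFS API.

```python
BLOCK_SIZE = 128 * 1024 * 1024  # default HDFS block size in bytes (128 MB)


def split_into_blocks(file_size, block_size=BLOCK_SIZE):
    """Return the list of block lengths (in bytes) for a file of file_size bytes."""
    if file_size < 0:
        raise ValueError("file_size must be non-negative")
    full_blocks, remainder = divmod(file_size, block_size)
    sizes = [block_size] * full_blocks
    if remainder:
        # The last block only occupies as much space as the data it holds.
        sizes.append(remainder)
    return sizes


sizes = split_into_blocks(300 * 1024 * 1024)
print(len(sizes), [s // (1024 * 1024) for s in sizes])  # 3 [128, 128, 44]
```

A 300 MB file therefore needs only three block entries, which is why large blocks keep the addressing overhead small.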
NameNode and DataNode (1)
NameNode:
Metadata is stored in memory.
Saves the mapping between files, blocks, and DataNodes.
DataNode:
The file content is stored on disk.
Maintains the mapping between block IDs and local files on DataNodes.
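The two kinds of mapping can be pictured with plain dictionaries. This is only a sketch: the paths, block IDs, DataNode names, and the `locate` helper are all made up for illustration and do not reflect HDFS internals.

```python
# NameNode side (kept in memory): file -> ordered block IDs, block ID -> DataNodes.
file_to_blocks = {"/logs/app.log": ["blk_1", "blk_2"]}
block_to_datanodes = {
    "blk_1": ["dn1", "dn2", "dn3"],
    "blk_2": ["dn2", "dn3", "dn4"],
}

# DataNode side (kept on disk): block ID -> local file, one table per DataNode.
local_block_files = {
    "dn1": {"blk_1": "/data/dn1/blk_1"},
    "dn2": {"blk_1": "/data/dn2/blk_1", "blk_2": "/data/dn2/blk_2"},
}


def locate(path):
    """Return [(block_id, datanodes)] for a file, as a NameNode lookup might."""
    return [(block_id, block_to_datanodes[block_id]) for block_id in file_to_blocks[path]]


print(locate("/logs/app.log"))
```

The NameNode answers "which blocks, on which nodes", while each DataNode only needs to know where a block lives in its own local file system.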
NameNode and DataNode (2)
DataNodes
DataNodes are the worker nodes that store and serve data in HDFS. They store
and retrieve blocks as directed by clients or the NameNode, and periodically
report the list of blocks they hold to the NameNode.
Data on each DataNode is stored in that node's local Linux file system.
HDFS Architecture Overview
HDFS Namespace Management
The HDFS namespace contains directories, files, and blocks.
HDFS uses a traditional hierarchical file organization, so users can create
and delete directories and files, move files between directories, and rename files
in the same way as in a common file system.
The NameNode maintains the file system namespace. Any change to the file
system namespace or its properties is recorded by the NameNode.
Communication Protocol
HDFS is a distributed file system deployed on a cluster. Therefore, a large
amount of data needs to be transmitted over the network.
All HDFS communication protocols are based on the TCP/IP protocol.
The client initiates a TCP connection to the NameNode through a configurable port
and uses the client protocol to interact with the NameNode.
The NameNode and the DataNode interact with each other by using the DataNode
protocol.
The client interacts with DataNodes through Remote Procedure Calls (RPCs).
By design, the NameNode never initiates an RPC request; it only responds to
RPC requests from clients and DataNodes.
Client
The client is the most common way for users to work with HDFS. HDFS
provides a client as part of its deployment.
The HDFS client is a library that contains HDFS file system interfaces that hide
most of the complexity of HDFS implementation.
Strictly speaking, the client is not a part of HDFS.
The client supports common operations such as opening, reading, and writing,
and provides a command line mode similar to Shell to access data in HDFS.
HDFS also provides Java APIs as client programming interfaces for applications
to access the file system.
Disadvantages of the HDFS Single-NameNode Architecture
HDFS is configured with only one NameNode, which greatly simplifies the system design but
also brings some obvious limitations:
Namespace limitation: The NameNode keeps the namespace in memory. Therefore, the number of
objects (files and blocks) that a NameNode can hold is limited by its memory size.
Performance bottleneck: The throughput of the entire distributed file system is limited by the
throughput of the single NameNode.
Isolation: Because there is only one NameNode and one namespace in the cluster, different
applications cannot be isolated from each other.
Cluster availability: If the only NameNode fails, the entire cluster becomes
unavailable.
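To make the namespace limitation concrete, the following back-of-the-envelope sketch estimates NameNode memory from the number of objects. The figure of roughly 150 bytes per object is a commonly quoted rule of thumb, not a value from this chapter, and the function name is hypothetical.

```python
BYTES_PER_OBJECT = 150  # assumed rule-of-thumb cost of one file or block object


def namenode_heap_bytes(num_files, blocks_per_file=1, bytes_per_object=BYTES_PER_OBJECT):
    """Estimate NameNode memory for num_files files with blocks_per_file blocks each."""
    objects = num_files * (1 + blocks_per_file)  # one file object plus its block objects
    return objects * bytes_per_object


# Ten million single-block files -> twenty million objects.
print(namenode_heap_bytes(10_000_000) / 1e9)  # 3.0 (GB, under the assumption above)
```

Under this assumption the cost grows with the number of objects rather than with the volume of data, which is also why many small files are expensive for the NameNode.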
HDFS High Availability (HA)
[Figure: HDFS HA architecture. Labels: Heartbeat, EditLog, three JournalNodes (JN), two ZKFailoverControllers (ZKFC), HDFS, Client, "Read/write data", "Copy".]
Metadata Persistence
[Figure: metadata persistence flow between the EditLog, FsImage, and FsImage.ckpt files. Recoverable steps: 3. Merge. 4. Upload the newly generated FsImage file to the active node. 5. Roll back FsImage.]
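The merge step of metadata persistence can be sketched as replaying logged operations onto a copy of the namespace image. This is a toy model under stated assumptions: the image is a set of paths, the log holds only `create` and `delete` operations, and `merge_checkpoint` is a hypothetical name rather than HDFS code.

```python
def merge_checkpoint(fsimage, editlog):
    """Apply logged namespace operations to a copy of the FsImage and return it."""
    checkpoint = set(fsimage)  # work on a copy, like writing FsImage.ckpt
    for operation, path in editlog:
        if operation == "create":
            checkpoint.add(path)
        elif operation == "delete":
            checkpoint.discard(path)
        else:
            raise ValueError(f"unknown operation: {operation!r}")
    return checkpoint


fsimage = {"/a", "/b"}
editlog = [("create", "/c"), ("delete", "/a")]
fsimage_ckpt = merge_checkpoint(fsimage, editlog)
print(sorted(fsimage_ckpt))  # ['/b', '/c']
```

Once the merged image exists, the log entries it covers are no longer needed to rebuild the namespace, which keeps restart time bounded.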
HDFS Federation
[Figure: HDFS Federation. Labels: APP; Client-1, Client-k, Client-n; Block Pools (Pool 1 to Pool n); Common Storage (Block Storage); DataNode1, DataNode2, ..., DataNodeN.]
Data Replica Mechanism
[Figure: data replica placement with network distances between nodes. Labels: Distance=4, Distance=4, Distance=0.]
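The distance labels in the replica figure are consistent with counting hops in a tree-shaped network topology. The sketch below assumes node locations are written as paths such as `/d1/r1/n1` (data center, rack, node); the path format and the `distance` function are illustrative assumptions, not an HDFS API.

```python
def distance(node_a, node_b):
    """Count the hops from each node up to their closest common ancestor."""
    path_a = node_a.strip("/").split("/")
    path_b = node_b.strip("/").split("/")
    common = 0
    for part_a, part_b in zip(path_a, path_b):
        if part_a != part_b:
            break
        common += 1
    return (len(path_a) - common) + (len(path_b) - common)


print(distance("/d1/r1/n1", "/d1/r1/n1"))  # 0: same node
print(distance("/d1/r1/n1", "/d1/r1/n2"))  # 2: same rack, different node
print(distance("/d1/r1/n1", "/d1/r2/n3"))  # 4: different racks
```

A smaller distance means a cheaper transfer, so a reader would prefer the replica with the smallest distance.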
HDFS Data Integrity Assurance
HDFS aims to ensure the integrity of stored data and the reliability of its components.
Rebuilding the replica data of failed data disks:
When a DataNode fails to report to the NameNode periodically, the NameNode initiates replica rebuilding
to restore the lost replicas.
Metadata reliability:
Metadata operations are recorded through a log mechanism, and metadata is stored on both the active and standby NameNodes.
The snapshot mechanism works like the snapshot mechanism common to file systems, so that data can be
restored promptly after a misoperation.
Safe mode:
HDFS provides a safe mode mechanism to prevent faults from spreading when DataNodes or disks are faulty.
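The replica rebuilding described above amounts to finding blocks whose live replica count has dropped below the target. The following is a minimal sketch under assumptions: a target of three replicas, made-up block and DataNode names, and a hypothetical `blocks_to_rebuild` helper.

```python
def blocks_to_rebuild(block_locations, live_datanodes, replication=3):
    """Return {block_id: missing_replicas} for blocks below the replication target."""
    missing = {}
    for block_id, datanodes in block_locations.items():
        live = [node for node in datanodes if node in live_datanodes]
        if len(live) < replication:
            missing[block_id] = replication - len(live)
    return missing


locations = {
    "blk_1": ["dn1", "dn2", "dn3"],
    "blk_2": ["dn2", "dn3", "dn4"],
}
# dn1 has stopped reporting, so it is absent from the live set.
print(blocks_to_rebuild(locations, live_datanodes={"dn2", "dn3", "dn4"}))  # {'blk_1': 1}
```

Each entry in the result is a block that would need one or more new copies made from a surviving replica.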
Other Key Design Points of the HDFS Architecture
Space reclamation mechanism:
Supports the recycle bin mechanism and dynamic setting of the number of copies.
Data organization:
Data is stored in blocks, and each block is kept as a file in the local file system of the operating system.
Access mode:
HDFS data can be accessed through Java APIs, HTTP, or shell commands.
Common Shell Commands
New Features of HDFS 3.0
Erasure coding (EC) is supported in HDFS.
Router-based federation is supported.
More than two NameNodes (multiple standby NameNodes) are supported.
A disk balancer is added to DataNodes to balance data across the disks within a node.
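One reason erasure coding matters is storage overhead. The sketch below compares three-way replication with a Reed-Solomon layout of 6 data units and 3 parity units; both figures are assumptions chosen for illustration, and the function names are hypothetical.

```python
def replication_overhead(replicas):
    """Extra storage as a fraction of the original data when keeping full copies."""
    return replicas - 1


def erasure_coding_overhead(data_units, parity_units):
    """Extra storage as a fraction of the original data when adding parity units."""
    return parity_units / data_units


print(replication_overhead(3))        # 2   -> 200% extra storage
print(erasure_coding_overhead(6, 3))  # 0.5 -> 50% extra storage
```

Under these assumptions both layouts survive the loss of more than one unit, but erasure coding does so with a quarter of the extra storage.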
HDFS Data Write Process
[Figure: HDFS data write process. Labels: Client node, FSDataOutputStream, NameNode, three DataNodes. Steps 4 and 5 are shown between the DataNodes; step 6 is "Close the file."]
HDFS Data Read Process
[Figure: HDFS data read process. Recoverable step: 4. Read data.]
ZooKeeper Overview
The ZooKeeper distributed service framework is used to solve some data
management problems that are frequently encountered in distributed
applications and provide distributed and highly available coordination service
capabilities.
In security mode, ZooKeeper depends on Kerberos and LdapServer for security
authentication; in non-security mode, it does not depend on them.
As a bottom-layer component, ZooKeeper is widely used and depended on by
upper-layer components such as Kafka, HDFS, HBase, and Storm.
ZooKeeper Service Architecture - Model
[Figure: ZooKeeper service model. Labels: ZooKeeper Service, Leader.]
A ZooKeeper cluster consists of a group of server nodes. In this group, there is only one leader node, and
the other nodes are followers.
The leader is elected during startup.
ZooKeeper uses a custom atomic message protocol (ZAB, ZooKeeper Atomic Broadcast) to ensure data consistency
among the nodes in the entire system.
After receiving a data change request, the leader node writes the data to disk and then to memory.
ZooKeeper Service Architecture - DR Capability
As long as ZooKeeper can complete a leader election, it can provide services for external
systems.
During a ZooKeeper election, an instance that obtains more than half of the votes
becomes the leader.
For a service with n instances, n may be an odd or even number.
When n is an odd number, assume n = 2x + 1; then a node needs to obtain x + 1
votes to become the leader, and the DR capability is x.
When n is an even number, assume n = 2x + 2; then a node needs to obtain x + 2
(more than half) votes to become the leader, and the DR capability is x.
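The odd and even cases above can be checked with a few lines of arithmetic. This sketch takes "DR capability" to mean the number of instances that can fail while a majority can still be formed; the function names are hypothetical.

```python
def votes_needed(n):
    """Smallest number of votes that is more than half of n instances."""
    return n // 2 + 1


def dr_capability(n):
    """Number of instances that can fail while a leader can still be elected."""
    return n - votes_needed(n)


for n in (3, 4, 5, 6):
    print(n, votes_needed(n), dr_capability(n))
# 3 2 1
# 4 3 1
# 5 3 2
# 6 4 2
```

An even-sized ensemble tolerates no more failures than the odd-sized ensemble one instance smaller, which is why ZooKeeper is usually deployed with an odd number of instances.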
Key Features of ZooKeeper
Eventual consistency: All servers present the same view to clients.
Real-time capability: Clients can obtain server updates and failure information within a specified
period of time.
Reliability: A message accepted by one server will be accepted by all servers.
Wait-free: Slow or faulty clients cannot interfere with the requests of fast clients, so
the requests of each client can be processed effectively.
Atomicity: A data update either succeeds or fails. There are no intermediate states.
Sequential consistency: Updates sent by a client are applied in the order in which
they are sent.
Read Function of ZooKeeper
Because of ZooKeeper's consistency guarantees, a client obtains the same view
regardless of which server it is connected to. Therefore, a client can perform
read operations against any node.
Write Function of ZooKeeper
[Figure: ZooKeeper write flow. Recoverable steps: 1. Write Request (from the client); 6. Write Response (to the client).]
Commands for ZooKeeper Clients
To invoke a ZooKeeper client, run the following command:
Summary
The distributed file system is an effective solution for large-scale data storage in the big data era. HDFS is an
open-source implementation of the GFS (Google File System) design and provides distributed storage of massive data
on a computer cluster built from inexpensive hardware.
HDFS offers compatibility with inexpensive hardware, streaming data reads and writes, support for large data sets,
a simple file model, and strong cross-platform compatibility. However, HDFS has its own limitations. For example, it is
not suitable for low-latency data access, cannot efficiently store a large number of small files, and does not support
concurrent writes by multiple users or arbitrary file modification.
"Block" is the core concept of HDFS. A large file is split into multiple blocks. The abstract block concept
supports large-scale file storage, simplifies system design, and facilitates data backup.
The ZooKeeper distributed service framework is used to solve some data management problems that are frequently
encountered in distributed applications and provide distributed and highly available coordination service capabilities.
Quiz
2. Why is the HDFS data block size larger than the disk block size?
Recommendations