Zookeeper Tutorial: What Is, Architecture of Apache Zookeeper
By David Taylor. Updated April 16, 2022
What is Zookeeper?
Apache Zookeeper is an open source distributed coordination service that helps
to manage a large set of hosts. Management and coordination in a distributed
environment are tricky. Zookeeper automates this process and allows developers to
focus on building software features rather than worrying about the software's
distributed nature.
Zookeeper helps you to maintain configuration information, naming, and group
services for distributed applications. It implements different protocols on the
cluster so that applications do not have to implement them on their own. It
provides a single coherent view of multiple machines.
Client: A client is one of the nodes in the distributed application cluster. It
accesses information from the server. Every client sends a message to the server
at regular intervals, which lets the server know that the client is alive.
Leader: One of the servers is designated the Leader. It gives all the information to
the clients as well as an acknowledgment that the server is alive. It performs
automatic recovery if any of the connected nodes fails.
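The regular client-to-server message described for the Client above works like a heartbeat. The sketch below is a toy model of that idea, not Zookeeper's actual implementation: the class name `LivenessTracker`, the `timeout` parameter, and the explicit `now` timestamps are all assumptions made for illustration.

```python
class LivenessTracker:
    """Toy server-side table of the last heartbeat seen from each client."""

    def __init__(self, timeout):
        self.timeout = timeout   # seconds of silence before a client is presumed dead
        self.last_seen = {}      # client id -> time of the last heartbeat

    def heartbeat(self, client_id, now):
        # Record that the client was heard from at time `now`.
        self.last_seen[client_id] = now

    def alive(self, now):
        # A client counts as alive if it pinged within the last `timeout` seconds.
        return sorted(c for c, t in self.last_seen.items() if now - t <= self.timeout)


tracker = LivenessTracker(timeout=10)
tracker.heartbeat("client-a", now=0)
tracker.heartbeat("client-b", now=0)
tracker.heartbeat("client-a", now=8)   # client-a keeps pinging, client-b goes silent
print(tracker.alive(now=12))           # ['client-a']
```

Passing the time in explicitly keeps the example deterministic; a real service would read a clock instead.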
The Zookeeper data model follows a hierarchical namespace where each node
is called a ZNode. A node is a system where the cluster runs.
Every ZNode has data. It may or may not have children.
ZNode paths:
Canonical, slash-separated and absolute
Do not use any relative references
Names may have Unicode characters
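The path rules above can be approximated with a small check. This is a rough sketch only: the function name `is_valid_znode_path` is hypothetical, and the real Zookeeper validator applies further restrictions (for example on certain control characters) that are not modeled here.

```python
def is_valid_znode_path(path):
    """Rough check of the listed rules; not the full Zookeeper validator."""
    if not path.startswith("/"):
        return False                 # paths must be absolute
    if path == "/":
        return True                  # the root itself
    if path.endswith("/"):
        return False                 # canonical form has no trailing slash
    for part in path[1:].split("/"):
        if part in ("", ".", ".."):
            return False             # no empty or relative components
    return True


print(is_valid_znode_path("/app/config"))     # True
print(is_valid_znode_path("app/config"))      # False (not absolute)
print(is_valid_znode_path("/app/../config"))  # False (relative reference)
print(is_valid_znode_path("/café/节点"))       # True (Unicode names are fine)
```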
Each ZNode maintains a stat structure that includes a version number for data changes.
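One common use of the version number is a conditional update: a write succeeds only if the znode is still at the version the writer last read. The toy class below sketches that idea in memory. The names `ToyZNode` and `BadVersionError` are hypothetical; the convention that `-1` matches any version follows the behaviour of Zookeeper's `setData` call.

```python
class BadVersionError(Exception):
    """Raised when the caller's expected version is stale (hypothetical name)."""


class ToyZNode:
    def __init__(self, data=b""):
        self.data = data
        self.version = 0             # incremented on every data change

    def set_data(self, data, expected_version=-1):
        # -1 means "do not check"; any other value must match the current version.
        if expected_version != -1 and expected_version != self.version:
            raise BadVersionError(f"expected {expected_version}, found {self.version}")
        self.data = data
        self.version += 1
        return self.version


node = ToyZNode(b"a")
print(node.set_data(b"b", expected_version=0))   # 1
try:
    node.set_data(b"c", expected_version=0)      # stale: the node is now at version 1
except BadVersionError as err:
    print("rejected:", err)
```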
Ephemeral znode: This type of znode stays alive only as long as the client that
created it is alive. When the client's session with Zookeeper ends, the znode is
deleted automatically. Moreover, ephemeral znodes are not allowed to have children.
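The lifetime rule for ephemeral znodes can be illustrated with a small in-memory model. This is a sketch under simplifying assumptions, not Zookeeper code: `ToyTree`, `create`, and `close_session` are hypothetical names, sessions are plain strings, and the model does not check that a parent exists.

```python
class ToyTree:
    """In-memory stand-in for a znode tree with ephemeral ownership."""

    def __init__(self):
        self.paths = set()
        self.ephemeral_owner = {}    # path -> id of the session that created it

    def create(self, path, ephemeral_session=None):
        parent = path.rsplit("/", 1)[0]
        if parent in self.ephemeral_owner:
            raise ValueError("ephemeral znodes cannot have children")
        self.paths.add(path)
        if ephemeral_session is not None:
            self.ephemeral_owner[path] = ephemeral_session

    def close_session(self, session):
        # Every ephemeral znode owned by the ending session disappears.
        gone = [p for p, s in self.ephemeral_owner.items() if s == session]
        for p in gone:
            del self.ephemeral_owner[p]
            self.paths.discard(p)


tree = ToyTree()
tree.create("/workers")                                # persistent
tree.create("/workers/w1", ephemeral_session="s1")
tree.create("/workers/w2", ephemeral_session="s2")
tree.close_session("s1")
print(sorted(tree.paths))                              # ['/workers', '/workers/w2']
```

This pattern is why ephemeral znodes are often used for membership lists: the entry for a worker vanishes once its session is gone.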
ZDM - Watches
In Zookeeper, a watch event is a one-time trigger that is sent to the client that set
the watch. It fires when the data being watched changes. ZDM watches allow clients
to get notifications when a znode changes. ZDM read operations such as getData(),
getChildren(), and exists() have the option of setting a watch.
Watches are ordered: the order of watch events corresponds to the order of the
updates. A client will see the watch event for a znode before seeing the new
data that corresponds to that znode.
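The "one-time trigger" behaviour is easy to miss, so here is a minimal in-memory sketch of it. `WatchedNode` and its methods are hypothetical stand-ins rather than the Zookeeper client API; the point is only that a watch set during a read fires on the next change and is then discarded.

```python
class WatchedNode:
    """Toy znode whose watches fire once and are then cleared."""

    def __init__(self, data):
        self.data = data
        self.watchers = []

    def get_data(self, watcher=None):
        # Reading may optionally register a watch, as with getData().
        if watcher is not None:
            self.watchers.append(watcher)
        return self.data

    def set_data(self, data):
        self.data = data
        fired, self.watchers = self.watchers, []   # clear before notifying
        for notify in fired:
            notify("data changed")


events = []
node = WatchedNode("v1")
node.get_data(watcher=events.append)   # set a watch while reading
node.set_data("v2")                    # fires the watch
node.set_data("v3")                    # no watch is set any more, nothing fires
print(events)                          # ['data changed']
```

A client that wants to keep observing a znode therefore has to set a new watch each time it reads after a notification.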
ACL permissions:
CREATE
READ
WRITE
DELETE
ADMIN
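The five permissions listed above are independent flags that can be combined, which makes a bitmask a natural representation. The sketch below uses Python's `IntFlag`; the numeric values are chosen to match those of the Zookeeper Java client's permission constants as far as I know, and the helper `allowed` is a hypothetical name.

```python
from enum import IntFlag


class Perm(IntFlag):
    READ = 1      # read a znode's data and list its children
    WRITE = 2     # set a znode's data
    CREATE = 4    # create a child znode
    DELETE = 8    # delete a child znode
    ADMIN = 16    # set permissions


def allowed(granted, needed):
    """True if every flag in `needed` is present in `granted`."""
    return (granted & needed) == needed


reader = Perm.READ
editor = Perm.READ | Perm.WRITE | Perm.CREATE
print(allowed(editor, Perm.WRITE))               # True
print(allowed(reader, Perm.DELETE))              # False
print(allowed(editor, Perm.READ | Perm.CREATE))  # True
```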
Before executing any request, the client must establish a session with the
service
All operations a client sends to the service are automatically associated with
a session
The client may connect to any server in the cluster, but it connects to only
a single server at a time
The session provides “order guarantees”. The requests in the session are
executed in FIFO order
The main states for a session are 1) Connecting, 2) Connected, 3) Closed, and
4) Not Connected.
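The session rules above (establish a session first, then have requests executed in FIFO order) can be sketched as a small state machine with a queue. This is a toy model under stated assumptions: `ToySession`, `submit`, and `drain` are hypothetical names, and the handshake with a server is reduced to a state change.

```python
from collections import deque
from enum import Enum


class SessionState(Enum):
    NOT_CONNECTED = "not connected"
    CONNECTING = "connecting"
    CONNECTED = "connected"
    CLOSED = "closed"


class ToySession:
    def __init__(self):
        self.state = SessionState.NOT_CONNECTED
        self.pending = deque()       # requests wait here in arrival order

    def connect(self):
        self.state = SessionState.CONNECTING
        # A real client would perform a handshake with one server here.
        self.state = SessionState.CONNECTED

    def submit(self, request):
        if self.state is not SessionState.CONNECTED:
            raise RuntimeError("a session must be established first")
        self.pending.append(request)

    def drain(self):
        # Execute (here: just return) the requests first-in, first-out.
        done = []
        while self.pending:
            done.append(self.pending.popleft())
        return done

    def close(self):
        self.state = SessionState.CLOSED


session = ToySession()
session.connect()
for request in ("create /a", "setData /a", "delete /a"):
    session.submit(request)
print(session.drain())   # ['create /a', 'setData /a', 'delete /a']
session.close()
```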
Summary
A distributed application is an application which can run on multiple
systems in a network
Apache Zookeeper is an open source distributed coordination service that
helps you manage a large set of hosts
It allows for mutual exclusion and cooperation between server processes
Server, Client, Leader, Follower, Ensemble/Cluster, and ZooKeeper WebUI are
important Zookeeper components
The three types of znodes are persistent, ephemeral, and sequential
A ZDM watch is a one-time trigger that is sent to the client that set the watch.
It fires when the data being watched changes
Zookeeper uses ACLs to control access to its znodes
Zookeeper is used for managing configuration, naming services, leader
election, message queuing, notification systems, synchronization,
distributed cluster management, etc.
Yahoo, Facebook, eBay, Twitter, and Netflix are some well-known companies
using Zookeeper
The main drawback of the tool is that data loss may occur when you add new
Zookeeper servers