Checkpointing and Deepdive

Uploaded by

Nadeem Khan
HDFS CHECKPOINTING INTRODUCTION AND DEEPDIVE

The NameNode maintains two constructs to manage the metadata: the FSImage and the edit log.

The FSImage is a snapshot of the metadata held in the NameNode's memory, and the edit log records the changes made to the metadata since that snapshot.

Let us take an example. Assume I add a file A to the system, move it to a different location, and then delete it. Then I add another file, C. Effectively, file A was added and then deleted, so there is no point in keeping its metadata.

I can regenerate the FSImage at the NameNode itself. Otherwise, for backup purposes, I can take a checkpoint and generate the new FSImage with the help of one of two other systems, the Secondary NameNode or the Standby NameNode. That process is what we call checkpointing.

The new FSImage will contain whatever metadata was already in the existing FSImage. Assume the initial FSImage already had file 1; it will continue to exist.

In the meantime we have made four changes. File A was added and then deleted, so it no longer exists; only file C effectively exists, and the metadata for file C will be included in the FSImage.

So the new FSImage contains the metadata from the old FSImage plus the net effect of replaying the edit log. Generating this new FSImage is the process we call checkpointing.
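The example above can be sketched in a few lines of Python. This is only a toy model under stated assumptions: the namespace is a plain set of paths, and the operation names (ADD, MOVE, DELETE) and the function replay_edit_log are hypothetical, not HDFS's actual edit log format or API.

```python
def replay_edit_log(fsimage, edit_log):
    """Return a new snapshot: the old snapshot plus the net effect of the edits."""
    new_image = set(fsimage)  # copy, so the old image stays untouched
    for op, *args in edit_log:
        if op == "ADD":
            new_image.add(args[0])
        elif op == "MOVE":
            src, dst = args
            new_image.discard(src)
            new_image.add(dst)
        elif op == "DELETE":
            new_image.discard(args[0])
        else:
            raise ValueError(f"unknown operation: {op}")
    return new_image


# The old FSImage already contains file 1.
old_fsimage = {"/file1"}

# Four changes recorded in the edit log (paths are illustrative).
edit_log = [
    ("ADD", "/fileA"),
    ("MOVE", "/fileA", "/archive/fileA"),
    ("DELETE", "/archive/fileA"),
    ("ADD", "/fileC"),
]

new_fsimage = replay_edit_log(old_fsimage, edit_log)
print(sorted(new_fsimage))  # ['/file1', '/fileC']
```

File A leaves no trace in the new snapshot, which is why replaying and then discarding the old edits keeps the metadata compact.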

Each entry in the edit log is a transaction, and the edit log itself is stored as a series of files that we call segments. So every change to the metadata is appended as a transaction to the current segment of the edit log.

A checkpoint can be performed by two systems.

One is the Secondary NameNode, which acts as a cold backup. At a regular interval it performs the checkpoint: it communicates with the NameNode, gets the necessary files (the edits and the fsimage), creates the new fsimage, and pushes it back to the NameNode.

The other system that does checkpointing is the Standby NameNode, which provides the hot backup. The Standby NameNode can also generate a checkpoint at a regular interval, or we can make it take a checkpoint through a manual trigger.

The Standby NameNode is not primarily designed for checkpointing. Its primary responsibility is the high availability of the NameNode, and it does checkpointing in addition to that. High availability is a different topic; we will see it in another session.

For checkpointing, the Standby NameNode needs to meet a precondition: either the regular interval has elapsed or a trigger has come through an administrative command. If either precondition is met, it creates a new FSImage and an MD5 for that FSImage, and pushes that FSImage to the active NameNode.
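The MD5 that travels with the FSImage lets the receiving side confirm that the image was not corrupted on the way. A minimal sketch of that idea, using Python's standard hashlib; the helper names md5_of_image and verify_image and the sample bytes are hypothetical, and real HDFS does this inside the NameNode rather than in a script.

```python
import hashlib


def md5_of_image(image_bytes: bytes) -> str:
    """Hex MD5 digest of the serialized image."""
    return hashlib.md5(image_bytes).hexdigest()


def verify_image(image_bytes: bytes, expected_md5: str) -> bool:
    """True if the received bytes match the digest sent alongside them."""
    return md5_of_image(image_bytes) == expected_md5


# Sender side: compute the digest for the new image (illustrative bytes).
image = b"/file1\n/fileC\n"
digest = md5_of_image(image)

# Receiver side: accept the image only if the digest matches.
print(verify_image(image, digest))         # True
print(verify_image(image + b"x", digest))  # False
```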

The Standby NameNode doesn't need to ask the NameNode for the changes, because it is a hot backup and at any point in time it is kept in sync with the NameNode through shared storage such as NFS (or, in many deployments, a quorum of JournalNodes).

If we use the Secondary NameNode for checkpointing, it communicates with the NameNode at a regular interval; the default is 60 minutes. Whenever the precondition is met, that is, at the regular interval or through an administrative command, the Secondary NameNode contacts the NameNode. All communication between the Secondary NameNode and the NameNode happens over HTTP.

It fetches the latest FSImage and the edits made in the NameNode up to that point. Before providing the FSImage and the edit logs to the Secondary NameNode, the NameNode rolls its edit log: it finalizes all the changes made so far, bundles them as a file, and starts writing any new change into a new edit log. So it rolls the current edit log into a file with a .1 suffix or a prefix, and a new log is started.

From that point, whatever fsimage and edit logs are available are brought into the Secondary NameNode. If the FSImage in the NameNode has changed, it is loaded into the Secondary NameNode. The Secondary NameNode then plays back the edit logs, applies all the changes, generates a new FSImage, and pushes it to the NameNode, so that on the next restart the NameNode can use the new FSImage and start up quickly.
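The roll, fetch, replay, and push sequence can be simulated in memory. This is a sketch only: ToyNameNode, secondary_checkpoint, and the ADD/DELETE operations are hypothetical names for illustration, and there is no real HTTP transfer or on-disk file here.

```python
class ToyNameNode:
    """In-memory stand-in for the NameNode's metadata files."""

    def __init__(self, fsimage):
        self.fsimage = set(fsimage)
        self.finalized_segments = []  # rolled (closed) edit log files
        self.in_progress = []         # edit log currently being written

    def log_edit(self, op, path):
        self.in_progress.append((op, path))

    def roll_edit_log(self):
        """Finalize the current edit log and start a new, empty one."""
        if self.in_progress:
            self.finalized_segments.append(self.in_progress)
        self.in_progress = []

    def install_checkpoint(self, new_fsimage, segments_merged):
        """Accept the pushed image and drop the segments it already covers."""
        self.fsimage = set(new_fsimage)
        self.finalized_segments = self.finalized_segments[segments_merged:]


def secondary_checkpoint(namenode):
    namenode.roll_edit_log()                      # 1. NameNode rolls its edits
    image = set(namenode.fsimage)                 # 2. "fetch" the fsimage
    segments = list(namenode.finalized_segments)  #    and the finalized edits
    for segment in segments:                      # 3. play back the edits
        for op, path in segment:
            if op == "ADD":
                image.add(path)
            elif op == "DELETE":
                image.discard(path)
    namenode.install_checkpoint(image, len(segments))  # 4. push the new image
    return image


nn = ToyNameNode({"/file1"})
nn.log_edit("ADD", "/fileA")
nn.log_edit("DELETE", "/fileA")
nn.log_edit("ADD", "/fileC")
secondary_checkpoint(nn)
print(sorted(nn.fsimage))  # ['/file1', '/fileC']
```

After the checkpoint the toy NameNode holds a fresh image and no pending segments, so a restart would have nothing to replay. That is the fast-restart benefit described above.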

In summary, checkpointing is done by the Secondary NameNode or the Standby NameNode.

So there will be a question: why can't the NameNode do the checkpoint itself? Checkpointing is a heavily CPU-intensive and I/O-intensive process, and we would have to pause the users accessing the file system while the checkpoint is happening. I cannot hold users back from accessing the data in HDFS while the checkpoint is going on.

So the NameNode offloads the checkpointing work either to the Secondary NameNode or to the Standby NameNode. The Secondary NameNode is dedicated to checkpointing, while the Standby NameNode does checkpointing along with high availability.

The interval at which the checkpoint happens and where the checkpoint is stored can all be controlled from the configuration files.

So let me check the configurations: what the checkpoint interval is, and where the checkpoint needs to be stored.

A checkpoint also happens whenever the number of transactions in the edit logs reaches a threshold limit. There is also a setting for how often to verify whether the number of transactions has reached that threshold, and that can be controlled as well.
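The trigger logic described here amounts to "enough time has passed, or enough transactions have piled up". A minimal sketch follows. The property names in the comments and their default values are the ones commonly listed in hdfs-default.xml (dfs.namenode.checkpoint.period, dfs.namenode.checkpoint.txns, dfs.namenode.checkpoint.check.period), but treat them as assumptions and verify them against your Hadoop version; the function should_checkpoint is hypothetical.

```python
# Assumed defaults; check hdfs-default.xml for your Hadoop version.
CHECKPOINT_PERIOD_SECS = 3600   # dfs.namenode.checkpoint.period (60 minutes)
CHECKPOINT_TXNS = 1_000_000     # dfs.namenode.checkpoint.txns
CHECK_PERIOD_SECS = 60          # dfs.namenode.checkpoint.check.period


def should_checkpoint(secs_since_last, uncheckpointed_txns,
                      period=CHECKPOINT_PERIOD_SECS,
                      txn_threshold=CHECKPOINT_TXNS):
    """Checkpoint if the interval has elapsed OR the txn threshold is reached."""
    return secs_since_last >= period or uncheckpointed_txns >= txn_threshold


# The checkpointer would evaluate this roughly every CHECK_PERIOD_SECS seconds.
print(should_checkpoint(3600, 10))        # True: the interval has elapsed
print(should_checkpoint(120, 1_000_000))  # True: the txn threshold is reached
print(should_checkpoint(120, 10))         # False: neither condition is met
```

Where the checkpoint is stored is a separate setting (dfs.namenode.checkpoint.dir in the same file, again to be verified for your version).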

Down the line we will see this in practice: by adding files and making changes to HDFS, we will see how the edit files and the fsimage get updated.