Name: Sandeep Kumar Das

PRN: 20020343071

Tech Class: C

Operation: Fault Tolerance Case


Types of faults in HDFS:

1. DataNode failure

The Hadoop file system follows a master/slave architecture in which the NameNode acts as the master and the DataNodes act as slaves. The NameNode is critical because it is the central component of HDFS: if it goes down, the whole Hadoop cluster becomes inaccessible and is considered dead. DataNodes store the actual data and work as instructed by the NameNode. A Hadoop cluster can have many DataNodes but only one active NameNode.

In HDFS, each DataNode sends a heartbeat and a block report to the NameNode. Receipt of a heartbeat implies that the DataNode is functioning properly, and a block report contains a list of all the blocks stored on that DataNode.

By default a DataNode sends a heartbeat every 3 seconds. If the NameNode stops receiving heartbeats from a DataNode for the configured timeout (about 10 minutes by default), it assumes that the DataNode is either dead or non-functional.

As soon as a DataNode is declared dead, the NameNode re-replicates all the blocks it hosted onto other DataNodes, using the replicas that were created initially. This is how the NameNode handles DataNode failures.
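
A minimal sketch of this dead-node detection, written in Java with hypothetical class and method names (the real logic lives inside the NameNode, e.g. its HeartbeatManager), could look like this:

import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

public class HeartbeatMonitor {
    // Assumed defaults: dfs.heartbeat.interval is 3 s and a DataNode is
    // declared dead after roughly 10.5 minutes without a heartbeat.
    private static final long DEAD_NODE_TIMEOUT_MS = 10 * 60_000L + 30_000L;

    private final Map<String, Long> lastHeartbeat = new ConcurrentHashMap<>();

    // Called each time a DataNode reports in.
    public void recordHeartbeat(String dataNodeId) {
        lastHeartbeat.put(dataNodeId, System.currentTimeMillis());
    }

    // A DataNode is considered dead once no heartbeat has arrived within
    // the timeout; the NameNode would then schedule re-replication of the
    // blocks that node hosted.
    public boolean isDead(String dataNodeId) {
        Long last = lastHeartbeat.get(dataNodeId);
        return last == null
                || System.currentTimeMillis() - last > DEAD_NODE_TIMEOUT_MS;
    }
}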

2. Rack failure

A rack failure brings down all the DataNodes mounted in that rack at once. HDFS guards against this with rack awareness: the default block placement policy stores the replicas of each block on at least two different racks (for a replication factor of 3, one replica on the local rack and two on a remote rack), so the data stays readable even if an entire rack is lost.

3. NameNode failure

If the NameNode fails, the whole Hadoop cluster stops working. There is usually no data loss; only the cluster's work is shut down, because the NameNode is the single point of contact for all DataNodes, and if the NameNode fails all communication stops.
Available solutions to handle NameNode failure in Hadoop 1

To handle this single point of failure, we can use an additional setup that keeps a backup of the NameNode metadata. If the primary NameNode fails, we can switch to the secondary (backup) copy and bring the Hadoop cluster back up from it; note that in Hadoop 1 this failover is manual, not automatic.

Available solutions to handle NameNode failure in Hadoop 2

HDFS High Availability (HA) of the NameNode was introduced in Hadoop 2. In this setup, two separate machines are configured as NameNodes: one is always active and the other is on standby. The active NameNode handles all client requests in the cluster, while the standby acts as a slave and maintains enough state to provide a fast failover if the active NameNode goes down.
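
As a sketch, the client-side settings for such an HA pair can be expressed with Hadoop's Configuration API. The nameservice name "mycluster" and the host names below are placeholders, not values from this document:

import org.apache.hadoop.conf.Configuration;

public class HaConfigSketch {
    public static Configuration haConf() {
        Configuration conf = new Configuration();
        // One logical nameservice backed by two NameNodes, nn1 and nn2.
        conf.set("dfs.nameservices", "mycluster");
        conf.set("dfs.ha.namenodes.mycluster", "nn1,nn2");
        // RPC addresses of the two NameNodes (host names assumed).
        conf.set("dfs.namenode.rpc-address.mycluster.nn1", "namenode1.example.com:8020");
        conf.set("dfs.namenode.rpc-address.mycluster.nn2", "namenode2.example.com:8020");
        // Client-side proxy that retries against whichever NameNode is active.
        conf.set("dfs.client.failover.proxy.provider.mycluster",
                "org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider");
        // Let a ZooKeeper-based controller promote the standby automatically.
        conf.setBoolean("dfs.ha.automatic-failover.enabled", true);
        return conf;
    }
}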

Fault Tolerance

Fault tolerance means keeping the data available even in case of some failures.

A distributed system works on a cluster of computers.

Most of the time a distributed system will spread the data, in partitions, over the various machines in the cluster.

If 1-2 machines in the cluster fail, we may not be able to read some of the data. This is a fault which we want to be able to tolerate.

Ways for achieving Fault-Tolerance in Hadoop HDFS

1. Replication Mechanism

Replication means making multiple copies of the data and keeping them on separate systems. If we have 3 copies of a partition stored on 3 different machines, we can survive 2 failures: even if 2 of the copies are lost, we can still read our data from the 3rd system. The term for the number of copies maintained is the replication factor.

Replication factor = 2 means 2 copies of each partition are maintained.

Replication factor = 3 means each block is replicated 3 times across different DataNodes. This is the default in a Hadoop system.
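
For illustration, the replication factor of an existing file can be changed through Hadoop's FileSystem API; the file path below is hypothetical:

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class ReplicationExample {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration(); // dfs.replication defaults to 3
        FileSystem fs = FileSystem.get(conf);
        Path file = new Path("/data/example.txt"); // hypothetical path
        // Ask for 3 copies of this file; the NameNode creates or removes
        // replicas asynchronously to reach the requested factor.
        boolean accepted = fs.setReplication(file, (short) 3);
        System.out.println("Replication change accepted: " + accepted);
        fs.close();
    }
}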

2. Erasure Coding

Erasure coding is a method used for fault tolerance that durably stores data with significant
space savings compared to replication.

RAID (Redundant Array of Independent Disks) is a well-known use of erasure coding. Erasure coding works by striping a file into small units (cells) and storing them on different disks.

For each stripe of the original dataset, a certain number of parity cells are calculated and stored. If one of the machines fails, the lost cells can be reconstructed from the remaining data and parity cells. With the Reed-Solomon RS-6-3 scheme used by HDFS (6 data cells plus 3 parity cells per stripe), erasure coding reduces the storage overhead to 50%, compared with 200% for 3-way replication.
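
The toy example below illustrates the parity idea with a single XOR parity cell, in the style of RAID-5. This is only a conceptual sketch; HDFS actually uses Reed-Solomon coding, which tolerates more than one lost cell:

public class XorParityDemo {
    public static void main(String[] args) {
        // Three equal-length data cells of one stripe.
        byte[][] data = {
                "data-block-1".getBytes(),
                "data-block-2".getBytes(),
                "data-block-3".getBytes(),
        };
        int len = data[0].length;

        // The parity cell is the byte-wise XOR of all data cells.
        byte[] parity = new byte[len];
        for (byte[] cell : data) {
            for (int i = 0; i < len; i++) parity[i] ^= cell[i];
        }

        // Simulate losing the first cell, then rebuild it by XOR-ing the
        // surviving cells with the parity cell.
        byte[] recovered = parity.clone();
        for (int i = 0; i < len; i++) {
            recovered[i] ^= data[1][i];
            recovered[i] ^= data[2][i];
        }
        System.out.println(new String(recovered)); // prints "data-block-1"
    }
}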
