
Hadoop 1.0 Vs 2.0

This document provides an overview of Apache Hadoop versions 1.0 and 2.0. It describes the key components of Hadoop including HDFS for storage, MapReduce for processing, and the master-slave architecture with a JobTracker and TaskTrackers. It outlines limitations in Hadoop 1.0 like single points of failure and resource utilization issues. The document then introduces YARN which was created in Hadoop 2.0 to address these limitations by separating resource management from job scheduling.

Uploaded by

Piyush Jangir

Presented By:

KALAI SELVI (2015272013)
PIYUSH JANGIR (2015272053)

Introduction
Apache Hadoop 1.0 vs 2.0
HDFS
Map Reduce
Master-Slave Architecture
Limitations of Hadoop 1.0
YARN
References

Open source software framework designed for storage and processing of large-scale data on clusters of commodity hardware

Created by Doug Cutting and Mike Cafarella.

Cutting named the program after his son's toy elephant.

The core of Apache Hadoop consists of a storage part, known as the Hadoop Distributed File System (HDFS), and a processing part called MapReduce.

Architecture

HDFS

Responsible for storing data on the cluster

Data files are split into blocks and distributed across the nodes in the cluster

Each block is replicated multiple times

Default replication is 3-fold
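
The block splitting and 3-fold replication described above can be sketched roughly as follows. This is a minimal illustration, not HDFS's actual placement logic: the function names, the byte sizes, and the round-robin placement are assumptions made for the example (real HDFS uses a rack-aware policy and a much larger block size).

```python
def split_into_blocks(file_size, block_size):
    """Return the sizes of the blocks a file of file_size bytes is split into."""
    full_blocks, remainder = divmod(file_size, block_size)
    sizes = [block_size] * full_blocks
    if remainder:
        sizes.append(remainder)  # the last block may be smaller than block_size
    return sizes


def place_replicas(num_blocks, nodes, replication=3):
    """Assign each block to `replication` distinct nodes.

    Round-robin placement is a simplifying assumption for illustration only.
    """
    if replication > len(nodes):
        raise ValueError("not enough nodes for the requested replication")
    placement = {}
    for block in range(num_blocks):
        placement[block] = [nodes[(block + r) % len(nodes)]
                            for r in range(replication)]
    return placement


blocks = split_into_blocks(file_size=300, block_size=128)
placement = place_replicas(len(blocks), ["n1", "n2", "n3", "n4"])
```

With four nodes and a replication factor of 3, every block ends up on three different nodes, so losing any single node still leaves two copies of each block.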

MapReduce: distributing computation across nodes

A method for distributing computation across multiple nodes

Each node processes the data that is stored at that node

Consists of two main phases:
Map
Reduce

The reduce task is always performed after the map task.

Map: takes a set of data and breaks it down into tuples (key/value pairs)

Reduce: takes the output from a map as its input and combines those data tuples into a smaller set of tuples.
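
The two phases can be illustrated with a word count written in plain Python. This is a single-process sketch of the idea only, not Hadoop's MapReduce API; the function names (`map_phase`, `shuffle`, `reduce_phase`, `word_count`) are hypothetical and chosen for the example.

```python
from collections import defaultdict


def map_phase(line):
    """Map: break one input line into (word, 1) tuples."""
    return [(word.lower(), 1) for word in line.split()]


def shuffle(pairs):
    """Group the mapped values by key so each key is reduced once."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Reduce: combine all tuples for a key into a single, smaller tuple."""
    return key, sum(values)


def word_count(lines):
    # All map output is produced first; reduce only runs afterwards.
    mapped = [pair for line in lines for pair in map_phase(line)]
    return dict(reduce_phase(k, v) for k, v in shuffle(mapped).items())


counts = word_count(["big data", "big cluster"])
```

In a real cluster each node would run `map_phase` on the blocks it stores locally, and only the grouped intermediate tuples would travel over the network to the reducers.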

Master Slave Architecture

NameNode
Stores metadata for the files, such as the directory structure
Handles creation of additional block replicas when necessary after a DataNode failure

DataNode
Stores the actual data in HDFS
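
How the NameNode's metadata lets it restore replication after a DataNode failure can be sketched as below. This is a simplified model under stated assumptions: the block map is a plain dictionary, the function name `handle_datanode_failure` is hypothetical, and the replacement node is simply the first live node that does not already hold the block.

```python
def handle_datanode_failure(block_map, live_nodes, failed, replication=3):
    """Return a new block map with replicas on `failed` replaced.

    block_map maps a block id to the list of nodes holding a replica.
    """
    survivors = [node for node in live_nodes if node != failed]
    new_map = {}
    for block, holders in block_map.items():
        # Drop the lost replica, then top up from the surviving nodes.
        holders = [node for node in holders if node != failed]
        for candidate in survivors:
            if len(holders) >= replication:
                break
            if candidate not in holders:
                holders.append(candidate)
        new_map[block] = holders
    return new_map


before = {"b0": ["n1", "n2", "n3"], "b1": ["n2", "n3", "n4"]}
after = handle_datanode_failure(before, ["n1", "n2", "n3", "n4"], failed="n2")
```

Only metadata is modelled here; in HDFS the actual block contents would be copied from a surviving DataNode to the newly chosen one.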

JobTracker
Splits a data processing job into smaller tasks and sends them to the TaskTracker process on each node

TaskTracker
Reports back to the JobTracker node on job progress, sends data, or requests new tasks
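
The interaction between the two daemons can be sketched with two small classes. This is a toy, in-process model: the class and method names mirror the roles above but are not Hadoop's API, and round-robin assignment is an assumption made for brevity.

```python
class TaskTracker:
    """Runs tasks on one node and reports the result back."""

    def __init__(self, name):
        self.name = name
        self.completed = []

    def run(self, task):
        self.completed.append(task)
        # The returned dict stands in for a progress report to the JobTracker.
        return {"tracker": self.name, "task": task, "status": "done"}


class JobTracker:
    """Splits a job into tasks and hands them to TaskTrackers."""

    def __init__(self, trackers):
        self.trackers = trackers
        self.reports = []

    def submit(self, records, task_size):
        # Split the job's input into fixed-size tasks.
        tasks = [records[i:i + task_size]
                 for i in range(0, len(records), task_size)]
        for index, task in enumerate(tasks):
            tracker = self.trackers[index % len(self.trackers)]  # round-robin
            self.reports.append(tracker.run(task))
        return len(tasks)


job_tracker = JobTracker([TaskTracker("tt1"), TaskTracker("tt2")])
num_tasks = job_tracker.submit(list(range(5)), task_size=2)
```

Note that every report flows through the single `JobTracker` object, which is the structural reason it becomes both a bottleneck and a single point of failure.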

Limitations of Hadoop 1.0

Scalability: the JobTracker runs on a single machine and performs several tasks, such as resource management, job scheduling, and monitoring.

Availability issue: in Hadoop 1.0, the JobTracker is a single point of failure. This means that if the JobTracker fails, all jobs must restart.

Problem with resource utilization: in Hadoop 1.0, each TaskTracker has a predefined number of map slots and reduce slots. Resource utilization issues occur because the map slots might be full while the reduce slots are empty (and vice versa).
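
The slot problem can be made concrete with a little arithmetic. The sketch below compares how many tasks can run at once with fixed map and reduce slots versus a hypothetical shared pool of the same total size; the function names and the numbers are assumptions chosen for illustration.

```python
def running_with_fixed_slots(map_tasks, reduce_tasks, map_slots, reduce_slots):
    """Tasks that can run at once when slots are reserved per task type."""
    return min(map_tasks, map_slots) + min(reduce_tasks, reduce_slots)


def running_with_shared_pool(map_tasks, reduce_tasks, total_slots):
    """Tasks that can run at once when any slot can take any task."""
    return min(map_tasks + reduce_tasks, total_slots)


# A node with 4 map slots and 2 reduce slots, and 6 map tasks waiting.
fixed = running_with_fixed_slots(6, 0, map_slots=4, reduce_slots=2)
shared = running_with_shared_pool(6, 0, total_slots=6)
```

With fixed slots only 4 of the 6 map tasks run while both reduce slots sit idle; with a shared pool all 6 could run.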

References
http://hortonworks.com/apache/yarn/#section_2
http://saphanatutorial.com/how-yarn-overcomes-mapreduce-limitations-in-hadoop-2-0/
http://www.slideshare.net/emcacademics/milind-hadoop-trainingbrazil
https://en.wikipedia.org/wiki/Apache_Hadoop
