Experiment 01 PDF

The experiment aimed to set up a Hadoop single node cluster and compare versions 1.x, 2.x, and 3.x. A single node cluster was successfully set up with the latest Hadoop 3.x version following steps like installing Java, configuring SSH, installing Hadoop, and editing configuration files. Hadoop 1.x introduced MapReduce and HDFS but only supported single tenancy, while 2.x added YARN for better resource management and multi-tenancy. Hadoop 3.x further improved scalability and supports resources beyond CPU and memory.

Uploaded by

Kaushik Shukla


Experiment 01 -222051017

16 April 2023 13:45

Aim: Set up a Hadoop single-node cluster and compare Hadoop 1.x, 2.x and 3.x.

Theory:
The Apache Hadoop software library is a framework for the distributed processing of large data
sets across clusters of computers. It is designed to scale up from a single server to thousands of
machines, each offering local computation and storage. Rather than relying on hardware to deliver
high availability, the library itself is designed to detect and handle failures at the application
layer. As a result, it delivers a highly available service on top of a cluster of computers, each of
which may be prone to failures.

Implementation:

A) Setup of Hadoop Single Node cluster:

a. Install the latest or a desired version of Java.

b. Since I want to manage Hadoop files independently, I create a separate user named
'hadoop'. I then switch from angela_user to the hadoop user and make sure that this newly
created user is a member of the group.

Big Data Lab Page 1

c. Next, we configure passwordless SSH.


d. Install and configure Apache Hadoop as the hadoop user.

e. Next, we configure the Java environment variables and then edit core-site.xml, hdfs-site.xml,
mapred-site.xml and yarn-site.xml.
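As a rough illustration of what the edited files contain, the following Python sketch generates minimal core-site.xml and hdfs-site.xml contents. The property names fs.defaultFS and dfs.replication are standard Hadoop settings. The values (hdfs://localhost:9000 and a replication factor of 1) and the helper name build_site_xml are assumptions typical of a single-node setup, not values recorded in this experiment.

```python
import xml.etree.ElementTree as ET


def build_site_xml(properties):
    """Return Hadoop *-site.xml text for a dict of property name -> value."""
    root = ET.Element("configuration")
    for name, value in properties.items():
        prop = ET.SubElement(root, "property")
        ET.SubElement(prop, "name").text = name
        ET.SubElement(prop, "value").text = str(value)
    # Pretty-print when available (ET.indent was added in Python 3.9).
    if hasattr(ET, "indent"):
        ET.indent(root)
    return ET.tostring(root, encoding="unicode")


# Assumed single-node values; adjust the host and port to your own setup.
core_site = build_site_xml({"fs.defaultFS": "hdfs://localhost:9000"})
hdfs_site = build_site_xml({"dfs.replication": 1})

if __name__ == "__main__":
    print(core_site)
    print(hdfs_site)
```

The printed text could be saved into the corresponding files in the Hadoop configuration directory. mapred-site.xml and yarn-site.xml follow the same configuration/property/name/value layout.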


f. Format the HDFS NameNode as shown above and validate the Hadoop configuration.
We then launch the NameNode, DataNode, YARN ResourceManager and NodeManager.

g. To verify the running components, we run jps (the Java Virtual Machine Process Status tool) as
shown above.
The Hadoop dashboard can then be opened in a browser using the machine's IP address and the
NameNode web UI port (9870 by default in Hadoop 3.x).

Example: http://localhost:9870/
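The jps check can also be scripted. The sketch below is a minimal example that parses jps output and reports which of the four daemons named in step f are missing. The function names are hypothetical, and the code assumes each jps line has the form "<pid> <process name>".

```python
import subprocess

# The four daemons launched in step f of this experiment.
EXPECTED_DAEMONS = {"NameNode", "DataNode", "ResourceManager", "NodeManager"}


def running_daemons(jps_output):
    """Extract process names from jps output lines of the form '<pid> <name>'."""
    names = set()
    for line in jps_output.splitlines():
        parts = line.split()
        if len(parts) >= 2:
            names.add(parts[1])
    return names


def missing_daemons(jps_output, expected=EXPECTED_DAEMONS):
    """Return the expected daemons absent from the jps output, sorted by name."""
    return sorted(set(expected) - running_daemons(jps_output))


if __name__ == "__main__":
    # Requires a JDK on PATH so that the jps command is available.
    output = subprocess.run(
        ["jps"], capture_output=True, text=True, check=True
    ).stdout
    print("Missing daemons:", missing_daemons(output) or "none")
```

An empty result means all four daemons appear in the process list. It does not by itself confirm that the daemons are healthy, so checking the dashboard is still worthwhile.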


Thus, a single-node cluster is set up.

B) Comparison between Hadoop 1.X vs Hadoop 2.X vs Hadoop 3.X:

Hadoop 3.x is the version installed above.

Release year
  Hadoop 1.x: Released in 2011.
  Hadoop 2.x: Released in 2012.
  Hadoop 3.x: Released in 2017.

Processing and resource management
  Hadoop 1.x: Introduced MapReduce and HDFS. The MapReduce framework is used both for data
  processing and for resource management.
  Hadoop 2.x: Added YARN (Yet Another Resource Negotiator) for better resource management. This
  enabled multi-tenancy, so the same cluster can be used by MapReduce as well as by other
  processes running on YARN.
  Hadoop 3.x: The YARN resource model is generalized to support user-defined resource types
  beyond CPU and memory. For example, the administrator can define resources like GPUs,
  software licenses, or locally-attached storage. YARN tasks can then be scheduled based on the
  availability of these resources.

Tenancy
  Hadoop 1.x: Supports single tenancy only.
  Hadoop 2.x: Supports multiple tenants using YARN.
  Hadoop 3.x: Supports multiple tenants.

Architecture and fault tolerance
  Hadoop 1.x: Uses a master-slave architecture with a single master and multiple slaves. If the
  master node fails, the entire cluster becomes unavailable.
  Hadoop 2.x: Also uses a master-slave architecture, but with multiple masters: an active
  NameNode and a standby NameNode. If the active master fails, the standby master takes over.
  As a result, Hadoop 2.x fixes the problem of a single point of failure.
  Hadoop 3.x: Adds support for more than two NameNodes (one active NameNode and multiple
  standby NameNodes).

Scalability
  Hadoop 1.x: Limited to 4000 nodes per cluster.
  Hadoop 2.x: Supports up to 10000 nodes in a cluster.
  Hadoop 3.x: Scalability is improved, and a cluster can have more than 10000 nodes.

NameNode recovery
  Hadoop 2.x: Manual intervention is needed for NameNode recovery.
  Hadoop 3.x: No manual intervention is needed for NameNode recovery.

Minimum Java version
  Hadoop 2.x: Java 7 is the minimum supported version.
  Hadoop 3.x: Java 8 is the minimum supported version.

Supported file systems
  Hadoop 2.x: Supports HDFS (default), FTP, Amazon S3 and Windows Azure Storage Blobs (WASB)
  file systems.
  Hadoop 3.x: Supports the file systems listed for Hadoop 2.x as well as the Microsoft Azure Data
  Lake filesystem.

Storage scheme
  Hadoop 2.x: Uses a 3x replication scheme that results in 200% storage overhead.
  Hadoop 3.x: Supports erasure coding in HDFS, which reduces the storage overhead to about 50%.

GPU support
  Hadoop 3.x: Added support for GPU hardware that can be used to execute deep learning
  algorithms on a Hadoop cluster.
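The storage overhead figures quoted in the comparison can be checked with a little arithmetic. The sketch below is an illustration only: the function names are hypothetical, and the erasure coding case assumes a Reed-Solomon layout with 6 data units and 3 parity units, which is one common policy and is not stated in the comparison itself.

```python
def replication_overhead(replicas):
    """Extra storage, as a percentage of the raw data size, for n-way replication."""
    return (replicas - 1) * 100.0


def erasure_coding_overhead(data_units, parity_units):
    """Extra storage, as a percentage of the raw data size, for a data+parity layout."""
    return parity_units / data_units * 100.0


if __name__ == "__main__":
    print(replication_overhead(3))        # 3x replication -> 200.0
    print(erasure_coding_overhead(6, 3))  # assumed 6 data + 3 parity -> 50.0
```

With 3x replication, every block is stored three times, so two extra copies are kept for each original. With the assumed 6+3 layout, only three parity units are added for every six data units.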

Diagrammatically, the above differences can be represented as:

Conclusion:

Thus, in this experiment, we have set up a Hadoop single-node cluster and compared different
versions of Hadoop (v1.x, v2.x and v3.x).
