
Hadoop 2.x Setup – Veeraravi Kumar Singiri, +91 9986755336

Setting up the environment:

In this tutorial you will learn the step-by-step process for setting up a Hadoop single-node cluster, so that you can play
around with the framework and learn more about it.

In this tutorial we are using the following software versions; you can download them by clicking the hyperlinks:
 Ubuntu Linux 12.04.3 LTS (steps are the same for any version)
 Hadoop 2.6.4 (steps are the same for any version)
Prerequisites:
1. Installing Java v1.7 or a later version
2. Adding a dedicated Hadoop system user
3. Configuring SSH access
Before installing any applications or software, please make sure your list of packages from all repositories and PPAs
is up to date; if it is not, update it by using this command:
sudo apt-get update

1. Installing Java v1.7:

Hadoop requires Java v1.6 or later, but use the latest version.

Note: There are multiple ways to install Java on a Linux machine.

Online: if you are connected to the internet, the Java installation is simple.

Step 1) sudo apt-get install openjdk-8-jdk (it will ask for a password; please enter your login or root password)

sudo apt-get install openjdk-8-jdk ## to install JDK

sudo apt-get install openjdk-8-jre ## to install JRE

Step 2) Execute update-java-alternatives -l ## lists all the installed Java versions if there is more than one,
and also gives the installation path, for example:

/usr/lib/jvm/java-1.8.0-openjdk-amd64

Step 3) Follow the steps in the "Steps to edit bashrc file" section below to update the .bashrc file.

Step 4) Use java -version to verify the Java installation.

Steps to edit bashrc file:


Step 5) Edit the .bashrc file for your user as follows:

To edit the .bashrc file, please follow the steps below.


1. Type cd and press Enter (this takes you to your home directory).
2. gedit .bashrc or vi .bashrc (if you are familiar with the vi editor, use the vi command)

If gedit doesn't open, execute export DISPLAY=:0.0 and then try gedit .bashrc again. It will open .bashrc in
another window (text editor).
3. Go to the end of the file and add the Java home path:

export JAVA_HOME=/usr/lib/jvm/java-1.8.0-openjdk-amd64

export PATH=$PATH:$JAVA_HOME/bin

Step 6) Save and close the .bashrc file, and run the source .bashrc command to update the environment.

$source .bashrc
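
You can confirm that JAVA_HOME is set by printing it:

$ echo $JAVA_HOME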

Step 7) Verify the Java version by running the following command:

$ java -version (it will display the version of Java you have installed)

java version "1.8.0_X"

Java(TM) SE Runtime Environment (build 1.8.0_X)

================JAVA INSTALLATION COMPLETED==========================

Note: in this document, wherever you find username, you need to replace it with your Ubuntu
machine username.
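If you are not sure of your username, you can print it with:

whoami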
3. Configuring SSH access:
SSH key-based authentication is required so that the master node can log in to the slave nodes (and the
secondary node) to start/stop them, and also to the local machine if you want to use Hadoop on it. For our single-node setup
of Hadoop, we therefore need to configure SSH access to localhost for the username user we created in the
previous section.

Before this step you have to make sure that SSH is up and running on your machine and is configured to allow SSH public-key
authentication.

Generate an SSH key for your user (username is the user we have used in this setup; it may be different on your machine).

sudo apt-get install openssh-server
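
To confirm that the SSH server is up and running, you can check its status:

sudo service ssh status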

a. Log in as your user with sudo su - username (your username)


b. Run this Key generation command:

ssh-keygen -t rsa -P ""

c. It will ask for the file name in which to save the key; just press Enter so that it will generate the key at
'/home/username/.ssh'
d. Enable SSH access to your local machine with this newly created key.
cat $HOME/.ssh/id_rsa.pub >> $HOME/.ssh/authorized_keys

e. Change the permissions on ~/.ssh/authorized_keys

$ chmod 0600 ~/.ssh/authorized_keys

$ exit

f. The final step is to test the SSH setup by connecting to your local machine as the username user.
ssh username@localhost or ssh localhost
This will add localhost permanently to the list of known hosts

4. Disabling IPv6 (not required):


We need to disable IPv6 because, on Ubuntu, using the 0.0.0.0 address for the various Hadoop configuration options can
result in Hadoop binding to IPv6 addresses. You will need to run the following commands using a root account:

sudo gedit /etc/sysctl.conf

Add the following lines to the end of the file and reboot the machine so the configuration takes effect.

#disable ipv6
net.ipv6.conf.all.disable_ipv6 = 1
net.ipv6.conf.default.disable_ipv6 = 1
net.ipv6.conf.lo.disable_ipv6 = 1
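
After the reboot, you can check whether IPv6 is disabled with the following command; a value of 1 means IPv6 is disabled:

cat /proc/sys/net/ipv6/conf/all/disable_ipv6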

Hadoop Installation:
Go to the terminal and type the following commands.

sudo chown -R username:groupname /usr/local/

sudo chmod -R 777 /usr/local/


mkdir -p /usr/local/hadoop-env

Go to the Apache downloads page and download Hadoop version 2.6.4 (prefer any stable version).
i. Download Hadoop from https://archive.apache.org/dist/hadoop/core/hadoop-2.6.4/ and copy
hadoop-2.6.4.tar.gz to /usr/local/hadoop-env
ii. Go to /usr/local/hadoop-env and unpack the compressed Hadoop file by using this command:

tar -xvzf hadoop-2.6.4.tar.gz
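
You can verify the extraction by listing the directory; it should now contain a hadoop-2.6.4 folder:

ls /usr/local/hadoop-env/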

Configuring Hadoop:
The following are the files we need to edit for a correct configuration of the single-node Hadoop cluster.
a. yarn-site.xml:
b. core-site.xml
c. mapred-site.xml
d. hdfs-site.xml
e. hadoop-env.sh
f. Update $HOME/.bashrc
We can find these files in the Hadoop configuration directory, which is located at

cd /usr/local/hadoop-env/hadoop-2.6.4/etc/hadoop

Note: Select each of the files listed above, right-click, open it as a text file, and modify it with the respective
configuration.

a. yarn-site.xml:
<configuration>
<!-- Site specific YARN configuration properties -->
<property>
<name>yarn.nodemanager.aux-services</name>
<value>mapreduce_shuffle</value>
</property>
<property>
<name>yarn.nodemanager.aux-services.mapreduce.shuffle.class</name>
<value>org.apache.hadoop.mapred.ShuffleHandler</value>
</property>
</configuration>

b. core-site.xml:
<configuration>
<property>
<name>fs.default.name</name>
<value>hdfs://localhost:9000</value>
</property>
</configuration>

c. mapred-site.xml:
If this file does not exist, make a copy of mapred-site.xml.template in the same location and rename it to
mapred-site.xml.
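For example, from the /usr/local/hadoop-env/hadoop-2.6.4/etc/hadoop directory:

cp mapred-site.xml.template mapred-site.xml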
i. Edit the mapred-site.xml file
ii. Add the following entry to the file and save and quit the file.
<configuration>
<property>
<name>mapreduce.framework.name</name>
<value>yarn</value>
</property>
</configuration>

d. hdfs-site.xml:
i. Edit the hdfs-site.xml file
ii. Create two directories to be used by the namenode and datanode (alternatively, you can skip this step; Hadoop will
create them when you execute hadoop namenode -format):
mkdir -p $HADOOP_HOME/yarn_data/hdfs/namenode
mkdir -p $HADOOP_HOME/yarn_data/hdfs/datanode

iii. Add the following entry to the file and save and quit the file:
<configuration>
<property>
<name>dfs.replication</name>
<value>1</value>
</property>
<property>
<name>dfs.namenode.name.dir</name>
<value>file:/usr/local/hadoop-env/hadoop-2.6.4/yarn_data/hdfs/namenode</value>
</property>
<property>
<name>dfs.datanode.data.dir</name>
<value>file:/usr/local/hadoop-env/hadoop-2.6.4/yarn_data/hdfs/datanode</value>
</property>
</configuration>

e. hadoop-env.sh:
Update the JAVA_HOME in this file:

export JAVA_HOME=/usr/lib/jvm/java-1.8.0-openjdk-amd64

f. Update $HOME/.bashrc

i. Go back to the home directory with the cd command and edit the .bashrc file.
vi .bashrc

ii. Add the following lines to the end of the file:

# Set Hadoop-related environment variables
export HADOOP_PREFIX=/usr/local/hadoop-env/hadoop-2.6.4
export HADOOP_HOME=/usr/local/hadoop-env/hadoop-2.6.4
export HADOOP_MAPRED_HOME=${HADOOP_HOME}
export HADOOP_COMMON_HOME=${HADOOP_HOME}
export HADOOP_HDFS_HOME=${HADOOP_HOME}
export YARN_HOME=${HADOOP_HOME}
export HADOOP_CONF_DIR=${HADOOP_HOME}/etc/hadoop
# Native Path
export HADOOP_COMMON_LIB_NATIVE_DIR=${HADOOP_PREFIX}/lib/native
export HADOOP_OPTS="-Djava.library.path=$HADOOP_PREFIX/lib"
# Add Hadoop bin/ and sbin/ directories to PATH
export PATH=$PATH:$HADOOP_HOME/bin:$HADOOP_HOME/sbin
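
After saving the file, reload .bashrc and verify that the hadoop command is on your PATH:

$ source .bashrc
$ hadoop version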

Formatting and Starting/Stopping the HDFS filesystem via the NameNode:


i. The first step in starting up your Hadoop installation is formatting the Hadoop filesystem, which is implemented on top
of the local filesystem of your cluster. You need to do this the first time you set up a Hadoop cluster. Do not format a
running Hadoop filesystem, as you will lose all the data currently in the cluster (in HDFS). To format the filesystem
(which simply initializes the directory specified by the dfs.namenode.name.dir property), run the following command:

hadoop namenode -format

ii. We can then start Hadoop by using the commands below.



start-dfs.sh
start-yarn.sh
or
You can start all the services by using start-all.sh
Note: Run the following command to check whether the services are running:
>jps
It will display the following services:
NameNode
DataNode
SecondaryNameNode
ResourceManager
NodeManager
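
A sample jps listing looks like this (the process IDs are illustrative and will differ on your machine):

2481 NameNode
2634 DataNode
2811 SecondaryNameNode
2957 ResourceManager
3102 NodeManager
3240 Jps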

Hadoop Web Interfaces:


Hadoop comes with several web interfaces which are by default available at these locations:
 HDFS NameNode: check health using http://localhost:50070
 HDFS Secondary NameNode: check status using http://localhost:50090
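
As a quick smoke test once the services are up (the /test path is just an arbitrary example), you can create and list
a directory in HDFS:

hadoop fs -mkdir /test
hadoop fs -ls /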

With this we are done setting up a single-node Hadoop cluster v2.6.4; hope this step-by-step guide helps you to set up the
same environment at your end.
iii. Stop Hadoop by running the following commands:

stop-dfs.sh
stop-yarn.sh
=============Hadoop setup completed======================

To start individual services, use the following commands.


iv. Start the Hadoop daemons individually by running the following commands:
Name node:
$ hadoop-daemon.sh start namenode
Data node:
$ hadoop-daemon.sh start datanode
Resource Manager:

$ yarn-daemon.sh start resourcemanager

Node Manager:

$ yarn-daemon.sh start nodemanager

Job History Server:

$ mr-jobhistory-daemon.sh start historyserver
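
Each daemon can be stopped with the same scripts by replacing start with stop, for example:

$ hadoop-daemon.sh stop namenode

$ yarn-daemon.sh stop resourcemanager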
