
HADOOP INSTALLATION GUIDE

Welcome to your comprehensive guide for setting up Hadoop and embarking on your
journey into the world of Big Data! Below, I've included a step-by-step guide that will help
you install Hadoop on your system. Let's dive right in!


Step 1: Install Java Development Kit

To start, you'll need to install the Java Development Kit (JDK) on your Ubuntu system. The
default Ubuntu repositories offer both Java 8 and Java 11, but it's recommended to use
Java 8 for compatibility with Hive. You can use the following command to install it:

sudo apt update && sudo apt install openjdk-8-jdk


Step 2: Verify Java Version

Once the Java Development Kit is successfully installed, you should check the version to
ensure it's working correctly:

java -version

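If the JDK is installed correctly, the output should look something like this (the exact build numbers will vary with your system):

openjdk version "1.8.0_392"
OpenJDK Runtime Environment (build 1.8.0_392-8u392-ga-1~22.04-b08)
OpenJDK 64-Bit Server VM (build 25.392-b08, mixed mode)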

Step 3: Install SSH

SSH (Secure Shell) is crucial for Hadoop, as it facilitates secure communication between
nodes in the Hadoop cluster. This is essential for maintaining data integrity and
confidentiality and enabling efficient distributed data processing across the cluster:

sudo apt install ssh

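Optionally, you can confirm the SSH server is running before moving on; on current Ubuntu releases the service is named 'ssh':

sudo systemctl status ssh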


Step 4: Create the Hadoop User

You must create a user specifically for running Hadoop components. This user will also be
used to log in to Hadoop's web interface. Run the following command to create the user
and set a password:

sudo adduser hadoop

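The command walks you through setting a password and some optional profile details. A typical session looks roughly like this (the UID/GID and prompts may differ slightly between Ubuntu versions):

Adding user `hadoop' ...
Adding new group `hadoop' (1001) ...
Adding new user `hadoop' (1001) with group `hadoop' ...
Creating home directory `/home/hadoop' ...
Copying files from `/etc/skel' ...
New password:
Retype new password: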

Step 5: Switch User

Switch to the newly created 'hadoop' user using the following command:

su - hadoop


Step 6: Configure SSH

Next, you should set up password-less SSH access for the 'hadoop' user to streamline the authentication process. You'll generate an SSH keypair for this purpose, which avoids the need to enter a password or passphrase each time you want to access the Hadoop system. When prompted for a file location and a passphrase, press Enter to accept the default location and leave the passphrase empty:

ssh-keygen -t rsa


Step 7: Set Permissions

Copy the generated public key to the authorized key file and set the proper permissions:

cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys

chmod 640 ~/.ssh/authorized_keys

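If SSH still asks for a password in the next step, the usual culprit is loose permissions on the '.ssh' directory itself, so it's common practice to tighten it as well:

chmod 700 ~/.ssh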

Step 8: SSH to the localhost

The first time you connect, you will be asked to confirm the host by adding its key to the known hosts. Type 'yes' and hit Enter to authenticate the localhost:

ssh localhost

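On a first connection the prompt typically looks like this (the key type and fingerprint will vary):

The authenticity of host 'localhost (127.0.0.1)' can't be established.
ED25519 key fingerprint is SHA256:....
Are you sure you want to continue connecting (yes/no/[fingerprint])? yes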

Step 9: Switch User

Switch to the 'hadoop' user again using the following command:

su - hadoop


Step 10: Install Hadoop


To begin, download Hadoop version 3.3.6 using the 'wget' command:

wget https://dlcdn.apache.org/hadoop/common/hadoop-3.3.6/hadoop-3.3.6.tar.gz


Once the download is complete, extract the contents of the downloaded file using the 'tar' command, then rename the extracted folder to 'hadoop' so that it matches the paths used in the rest of this guide:

tar -xvzf hadoop-3.3.6.tar.gz

mv hadoop-3.3.6 hadoop


Next, you need to set up environment variables for Java and Hadoop in your system. Open the '~/.bashrc' file in your preferred text editor. If you're using 'nano', you can paste with 'Ctrl+Shift+V', then save and exit with 'Ctrl+X', 'Y', and 'Enter':

nano ~/.bashrc


Append the following lines to the file:

export JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64
export HADOOP_HOME=/home/hadoop/hadoop
export HADOOP_INSTALL=$HADOOP_HOME
export HADOOP_MAPRED_HOME=$HADOOP_HOME
export HADOOP_COMMON_HOME=$HADOOP_HOME
export HADOOP_HDFS_HOME=$HADOOP_HOME
export HADOOP_YARN_HOME=$HADOOP_HOME
export HADOOP_COMMON_LIB_NATIVE_DIR=$HADOOP_HOME/lib/native
export PATH=$PATH:$HADOOP_HOME/sbin:$HADOOP_HOME/bin
export HADOOP_OPTS="-Djava.library.path=$HADOOP_HOME/lib/native"


Load the above configuration into the current environment:

source ~/.bashrc

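To confirm the new variables are in effect, you can echo one of them and ask Hadoop for its version; both should resolve without errors:

echo $HADOOP_HOME
hadoop version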

Additionally, you should configure the 'JAVA_HOME' in the 'hadoop-env.sh' file. Edit this
file with a text editor:

nano $HADOOP_HOME/etc/hadoop/hadoop-env.sh


Search for the "export JAVA_HOME" line and set it as follows:

export JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64
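A quick way to double-check the change without reopening the editor is to grep for the line you just set (commented-out examples may also show up):

grep "JAVA_HOME" $HADOOP_HOME/etc/hadoop/hadoop-env.sh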


Step 11: Configure Hadoop

Create the namenode and datanode directories within the 'hadoop' user's home directory
using the following commands:

cd hadoop/

mkdir -p ~/hadoopdata/hdfs/{namenode,datanode}


Next, edit the 'core-site.xml' file to set the default filesystem URI. If you are not running everything on a single machine, replace 'localhost' with your system hostname:

nano $HADOOP_HOME/etc/hadoop/core-site.xml


<configuration>
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://localhost:9000</value>
  </property>
</configuration>

Save and close the file. Then, edit the 'hdfs-site.xml' file:

nano $HADOOP_HOME/etc/hadoop/hdfs-site.xml


Set the replication factor to 1 (appropriate for a single-node setup) and point the NameNode and DataNode at the directories you created, as shown below:

<configuration>
  <property>
    <name>dfs.replication</name>
    <value>1</value>
  </property>
  <property>
    <name>dfs.namenode.name.dir</name>
    <value>file:///home/hadoop/hadoopdata/hdfs/namenode</value>
  </property>
  <property>
    <name>dfs.datanode.data.dir</name>
    <value>file:///home/hadoop/hadoopdata/hdfs/datanode</value>
  </property>
</configuration>


Save and close the file. Then, edit the 'mapred-site.xml' file:

nano $HADOOP_HOME/etc/hadoop/mapred-site.xml


Make the following changes to run MapReduce on YARN and point the MapReduce environment at your Hadoop installation:

<configuration>
  <property>
    <name>mapreduce.framework.name</name>
    <value>yarn</value>
  </property>
  <property>
    <name>yarn.app.mapreduce.am.env</name>
    <value>HADOOP_MAPRED_HOME=$HADOOP_HOME</value>
  </property>
  <property>
    <name>mapreduce.map.env</name>
    <value>HADOOP_MAPRED_HOME=$HADOOP_HOME</value>
  </property>
  <property>
    <name>mapreduce.reduce.env</name>
    <value>HADOOP_MAPRED_HOME=$HADOOP_HOME</value>
  </property>
</configuration>

Finally, edit the 'yarn-site.xml' file:

nano $HADOOP_HOME/etc/hadoop/yarn-site.xml


Make the following changes:


<configuration>
  <property>
    <name>yarn.nodemanager.aux-services</name>
    <value>mapreduce_shuffle</value>
  </property>
</configuration>

Step 12: Start Hadoop Cluster

Before starting the Hadoop cluster, you need to format the Namenode as the 'hadoop'
user. Format the Hadoop Namenode with the following command:

hdfs namenode -format


Once the Namenode directory is successfully formatted with the HDFS file system, you
will see the message "Storage directory /home/hadoop/hadoopdata/hdfs/namenode has
been successfully formatted." Start the Hadoop cluster using:

start-all.sh

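The startup messages will look roughly like this (hostnames depend on your machine, and warnings about native libraries are harmless):

Starting namenodes on [localhost]
Starting datanodes
Starting secondary namenodes [ubuntu]
Starting resourcemanager
Starting nodemanagers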

You can check the status of all Hadoop services using the command:

jps

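If everything started correctly, 'jps' should list all five Hadoop daemons alongside itself; the process IDs here are illustrative:

12034 NameNode
12175 DataNode
12398 SecondaryNameNode
12641 ResourceManager
12792 NodeManager
13013 Jps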

Step 13: Access Hadoop Namenode and Resource Manager

First, determine your IP address by running:

ifconfig


If needed, install 'net-tools' using:

sudo apt install net-tools


To access the Namenode, open your web browser and visit http://your-server-ip:9870. Replace 'your-server-ip' with your actual IP address. You should see the Namenode web interface.


To access the Resource Manager, open your web browser and visit http://your-server-ip:8088. You should see the Resource Manager web interface.

Step 14: Verify the Hadoop Cluster


The Hadoop cluster is now installed and configured. To test it, create some directories in the HDFS filesystem using the following commands:

hdfs dfs -mkdir /test1


hdfs dfs -mkdir /logs


Next, run the following command to list the above directories:

hdfs dfs -ls /


Both '/test1' and '/logs' should appear in the listing.
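For reference, the output typically looks like this (owner, timestamps, and ordering are illustrative):

Found 2 items
drwxr-xr-x   - hadoop supergroup          0 2025-03-05 11:31 /logs
drwxr-xr-x   - hadoop supergroup          0 2025-03-05 11:30 /test1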

Next, put some files into the Hadoop file system. For example, copy log files from the host machine into the '/logs' directory you just created:

hdfs dfs -put /var/log/* /logs/


You can also verify these files and directories in the Hadoop web interface. Go to the Namenode web interface and click Utilities => Browse the file system. You should see the directories you created earlier.
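As a final smoke test, you can run one of the example MapReduce jobs that ship with Hadoop; this assumes the examples jar bundled with the 3.3.6 tarball you extracted earlier:

hadoop jar $HADOOP_HOME/share/hadoop/mapreduce/hadoop-mapreduce-examples-3.3.6.jar pi 2 5

The job should print an estimated value of Pi when it finishes. If it hangs or fails, revisit the 'mapred-site.xml' and 'yarn-site.xml' settings above.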


Step 15: Stop Hadoop Services

To stop the Hadoop services, run the following command as the 'hadoop' user:

stop-all.sh

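You can confirm that everything has shut down by running 'jps' again; only the 'Jps' process itself should remain in the list:

jps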

In summary, you've learned how to install Hadoop on Ubuntu. Now, you're ready to unlock
the potential of big data analytics. Happy exploring!
