How To Install Hadoop On Ubuntu 18.04 or 20.04
Every major industry is implementing Apache Hadoop as the standard framework for processing and
storing big data. Hadoop is designed to be deployed across a network of hundreds or even
thousands of dedicated servers. All these machines work together to deal with the massive volume
and variety of incoming datasets.
Deploying Hadoop services on a single node is a great way to get yourself acquainted with basic
Hadoop commands and concepts.
Prerequisites
At the moment, Apache Hadoop 3.x fully supports Java 8. The OpenJDK 8 package in Ubuntu
contains both the runtime environment and development kit.
The OpenJDK or Oracle Java version can affect how elements of a Hadoop ecosystem interact. To
install a specific Java version, check out our detailed guide on how to install Java on Ubuntu.
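To install OpenJDK 8, the version this tutorial assumes, use the apt package manager:
sudo apt update
sudo apt install openjdk-8-jdk -y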
Once the installation process is complete, verify the current Java version:
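java -version; javac -version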
The output displays the installed Java version, confirming that OpenJDK is in place.
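Hadoop uses SSH to manage its nodes, even on a single machine. Install the OpenSSH server and client if they are not already present:
sudo apt install openssh-server openssh-client -y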
If you have installed OpenSSH for the first time, use this opportunity to implement these vital SSH
security recommendations.
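It is advisable to create a non-root user for the Hadoop environment. Utilize the adduser command to create a new Hadoop user:
sudo adduser hdoop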
The username, in this example, is hdoop. You are free to use any username and password you see fit. Switch to the newly created user and enter the corresponding password:
su - hdoop
The user now needs to be able to SSH to the localhost without being prompted for a password.
Enable Passwordless SSH for Hadoop User
Generate an SSH key pair and define the location it is to be stored in:
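ssh-keygen -t rsa -P '' -f ~/.ssh/id_rsa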
The system proceeds to generate and save the SSH key pair.
Use the cat command to store the public key as authorized_keys in the .ssh directory:
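cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys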
Set the permissions for your user with the chmod command:
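chmod 0600 ~/.ssh/authorized_keys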
The new user is now able to SSH without needing to enter a password every time. Verify everything
is set up correctly by using the hdoop user to SSH to localhost:
ssh localhost
After an initial prompt, the Hadoop user is now able to establish an SSH connection to the localhost
seamlessly.
The steps outlined in this tutorial use the Binary download for Hadoop Version 3.2.1. On the official Apache Hadoop download page, select your preferred option, and you are presented with a mirror link that allows you to download the Hadoop tar package.
Note: It is sound practice to verify Hadoop downloads originating from mirror sites.
The instructions for using GPG or SHA-512 for verification are provided on the
official download page.
Use the provided mirror link and download the Hadoop package with the wget command:
wget https://downloads.apache.org/hadoop/common/hadoop-3.2.1/hadoop-3.2.1.tar.gz
Once the download is complete, extract the files to initiate the Hadoop installation:
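tar xzf hadoop-3.2.1.tar.gz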
The Hadoop binary files are now located within the hadoop-3.2.1 directory.
This setup, also called pseudo-distributed mode, allows each Hadoop daemon to run as a single
Java process. A Hadoop environment is configured by editing a set of configuration files:
.bashrc
hadoop-env.sh
core-site.xml
hdfs-site.xml
mapred-site.xml
yarn-site.xml
Open the .bashrc shell configuration file in a text editor and define the Hadoop environment variables by adding the following content to the end of the file:
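nano .bashrc
The exact paths depend on your setup. Assuming Hadoop 3.2.1 was extracted to the hdoop user's home directory, the entries look like this:
#Hadoop Related Options
export HADOOP_HOME=/home/hdoop/hadoop-3.2.1
export HADOOP_INSTALL=$HADOOP_HOME
export HADOOP_MAPRED_HOME=$HADOOP_HOME
export HADOOP_COMMON_HOME=$HADOOP_HOME
export HADOOP_HDFS_HOME=$HADOOP_HOME
export YARN_HOME=$HADOOP_HOME
export HADOOP_COMMON_LIB_NATIVE_DIR=$HADOOP_HOME/lib/native
export PATH=$PATH:$HADOOP_HOME/sbin:$HADOOP_HOME/bin
export HADOOP_OPTS="-Djava.library.path=$HADOOP_HOME/lib/native"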
Once you add the variables, save and exit the .bashrc file.
It is vital to apply the changes to the current running environment by using the following command:
source ~/.bashrc
When setting up a single node Hadoop cluster, you need to define which Java implementation is to
be utilized. Use the previously created $HADOOP_HOME variable to access the hadoop-env.sh file:
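nano $HADOOP_HOME/etc/hadoop/hadoop-env.sh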
Uncomment the $JAVA_HOME variable (i.e., remove the # sign) and add the full path to the OpenJDK
installation on your system. If you have installed the same version as presented in the first part of
this tutorial, add the following line:
export JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64
The path needs to match the location of the Java installation on your system.
If you need help to locate the correct Java path, run the following command in your terminal window:
which javac
The resulting output provides the path to the Java binary directory.
Use the provided path to find the OpenJDK directory with the following command:
readlink -f /usr/bin/javac
The section of the path just before the /bin/javac directory needs to be assigned to the $JAVA_HOME
variable.
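For example, if readlink returns /usr/lib/jvm/java-8-openjdk-amd64/bin/javac, the value to assign to $JAVA_HOME is /usr/lib/jvm/java-8-openjdk-amd64.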
The core-site.xml file defines HDFS and Hadoop core properties. To set up Hadoop in a pseudo-distributed mode, you need to specify the URL for your NameNode and the temporary directory Hadoop uses for the map and reduce process.
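Open the core-site.xml file in a text editor:
nano $HADOOP_HOME/etc/hadoop/core-site.xml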
Add the following configuration to override the default values for the temporary directory and add
your HDFS URL to replace the default local file system setting:
<configuration>
<property>
<name>hadoop.tmp.dir</name>
<value>/home/hdoop/tmpdata</value>
</property>
<property>
<name>fs.default.name</name>
<value>hdfs://127.0.0.1:9000</value>
</property>
</configuration>
This example uses values specific to the local system. You should use values that match your system's requirements. The data needs to be consistent throughout the configuration process.
Do not forget to create a Linux directory in the location you specified for your temporary data.
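For the example value above:
mkdir /home/hdoop/tmpdata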
Additionally, in the hdfs-site.xml file, the default dfs.replication value of 3 needs to be changed to 1 to match the single node setup.
Use the following command to open the hdfs-site.xml file for editing:
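nano $HADOOP_HOME/etc/hadoop/hdfs-site.xml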
Add the following configuration to the file and, if needed, adjust the NameNode and DataNode
directories to your custom locations:
<configuration>
<property>
<name>dfs.namenode.name.dir</name>
<value>/home/hdoop/dfsdata/namenode</value>
</property>
<property>
<name>dfs.datanode.data.dir</name>
<value>/home/hdoop/dfsdata/datanode</value>
</property>
<property>
<name>dfs.replication</name>
<value>1</value>
</property>
</configuration>
If necessary, create the specific directories you defined for the dfs.namenode.name.dir and dfs.datanode.data.dir values:
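mkdir -p /home/hdoop/dfsdata/namenode /home/hdoop/dfsdata/datanode
Edit mapred-site.xml File
Use the following command to access the mapred-site.xml file and define MapReduce values:
nano $HADOOP_HOME/etc/hadoop/mapred-site.xml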
Add the following configuration to change the default MapReduce framework name value to yarn:
<configuration>
<property>
<name>mapreduce.framework.name</name>
<value>yarn</value>
</property>
</configuration>
Edit yarn-site.xml File
The yarn-site.xml file is used to define settings relevant to YARN. It contains configurations for the
Node Manager, Resource Manager, Containers, and Application Master.
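Open the yarn-site.xml file in a text editor:
nano $HADOOP_HOME/etc/hadoop/yarn-site.xml
Append the following configuration to the file: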
<configuration>
<property>
<name>yarn.nodemanager.aux-services</name>
<value>mapreduce_shuffle</value>
</property>
<property>
<name>yarn.nodemanager.aux-services.mapreduce.shuffle.class</name>
<value>org.apache.hadoop.mapred.ShuffleHandler</value>
</property>
<property>
<name>yarn.resourcemanager.hostname</name>
<value>127.0.0.1</value>
</property>
<property>
<name>yarn.acl.enable</name>
<value>0</value>
</property>
<property>
<name>yarn.nodemanager.env-whitelist</name>
<value>JAVA_HOME,HADOOP_COMMON_HOME,HADOOP_HDFS_HOME,HADOOP_CONF_DIR,CLASSPATH_PREPEND_DISTCACHE,HADOOP_YARN_HOME,HADOOP_MAPRED_HOME</value>
</property>
</configuration>
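Save the file when you are done. It is important to format the NameNode before starting Hadoop services for the first time:
hdfs namenode -format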
The shutdown notification signifies the end of the NameNode format process.
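Navigate to the hadoop-3.2.1/sbin directory and execute the following command to start the NameNode and DataNode:
./start-dfs.sh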
Once the namenode, datanodes, and secondary namenode are up and running, start the YARN
resource and nodemanagers by typing:
./start-yarn.sh
As with the previous command, the output informs you that the processes are starting.
Type this simple command to check if all the daemons are active and running as Java processes:
jps
If everything is working as intended, the resulting list of running Java processes contains all the
HDFS and YARN daemons.
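The list includes the NameNode, DataNode, SecondaryNameNode, ResourceManager, and NodeManager processes.
Use your preferred browser and navigate to the localhost URL. The default port 9870 gives you access to the NameNode user interface: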
http://localhost:9870
The NameNode user interface provides a comprehensive overview of the entire cluster.
The default port 9864 is used to access individual DataNodes directly from your browser:
http://localhost:9864
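The default port 8088 is used to access the YARN Resource Manager: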
http://localhost:8088
The Resource Manager is an invaluable tool that allows you to monitor all running processes in your
Hadoop cluster.
Conclusion
You have successfully installed Hadoop on Ubuntu and deployed it in a pseudo-distributed mode. A
single node Hadoop deployment is an excellent starting point to explore basic HDFS commands and
acquire the experience you need to design a fully distributed Hadoop cluster.