
Hadoop Implementation Steps on Ubuntu 16.04/18.04 Linux

(COMPUTER SCIENCE AND ENGINEERING)

BY

ADITYA BHARDWAJ

DEPARTMENT OF COMPUTER SCIENCE & ENGINEERING

PEC

SECTOR – 12, CHANDIGARH, INDIA

2019
Step 1 – Prerequisites
Before beginning the installation, run a login shell as the sudo user and
update the currently installed packages. Let's assume my Ubuntu host name is
server3.

sudo apt update

OpenJDK 8

Java 8 is a Long Term Support version and is still widely supported, though
Oracle's public updates for it ended in January 2019. To install OpenJDK 8, execute the following
command:

root@server3: sudo apt install openjdk-8-jdk

Verify that this is installed with

root@server3: java -version

You'll see output like this:

Output
openjdk version "1.8.0_162"
OpenJDK Runtime Environment (build 1.8.0_162-8u162-b12-1-b12)
OpenJDK 64-Bit Server VM (build 25.162-b12, mixed mode)
You have successfully installed Java 8 on your Ubuntu 16.04/18.04 LTS system.

Find the directory where Java is installed:

root@server3: readlink -f /usr/bin/java | sed "s:bin/java::"


root@server3: sudo gedit /etc/environment
The following configuration is added to the environment file (use the path reported by readlink above; it matches the OpenJDK 8 package installed earlier):

JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64/
PATH=$PATH:$HOME/bin:$JAVA_HOME/bin
PATH=$PATH:$HOME/bin:$JAVA_HOME/bin

export JAVA_HOME

export PATH

Verify that the environment variable is set (log out and back in, or source the file, for the change to take effect):

root@server3: echo $JAVA_HOME

Step 2 – Create User for Hadoop


Hit CTRL+ALT+T to open a terminal; we will install Hadoop from there. For new Linux
users, things can get confusing when installing different programs and managing them from
the same login. If you are one of them, we have a solution: create a new dedicated Hadoop
user, and use that separate login whenever you want to work with Hadoop. Simple.

$ sudo addgroup hadoop


$ sudo adduser --ingroup hadoop hduser
Note: Enter a password for the new Unix user; for the other prompts, just hit Enter and press 'y' at the end.
Add the Hadoop user to the sudo group (basically, grant it all permissions):

server1@server3: sudo adduser hduser sudo

Install SSH

root@server3: sudo apt-get install ssh


Passwordless entry for localhost using SSH

root@server3: su - hduser
hduser@server3: ssh-keygen -t rsa
Note: When asked for a file name or location, leave it blank.
hduser@server3: cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
hduser@server3: chmod 0600 ~/.ssh/authorized_keys
Figure: SSH Key generation

Check if ssh works,

$ ssh localhost
Figure: hduser permission

Once we are logged in to localhost, exit the session using the following command.

$ exit
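For reference, the whole passwordless-SSH setup can also be scripted non-interactively. This is a minimal sketch using standard ssh-keygen options; it assumes no key already exists at ~/.ssh/id_rsa:

# Generate an RSA key with an empty passphrase, without prompting
ssh-keygen -t rsa -P "" -f ~/.ssh/id_rsa
# Authorize the new key for logins to this machine
cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
chmod 0600 ~/.ssh/authorized_keys
# This should print OK without asking for a password
ssh localhost 'echo OK'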
Step 3 – Download the Hadoop Archive
In this step, download the Hadoop 3.1.2 release archive using the command
below. You can also select an alternate download mirror to increase
download speed.

cd ~

server1@server3: wget http://www-eu.apache.org/dist/hadoop/common/hadoop-3.1.2/hadoop-3.1.2.tar.gz

server1@server3: tar xzf hadoop-3.1.2.tar.gz
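Optionally, verify the integrity of the download before extracting. Apache publishes a .sha512 file alongside each release; the URL below points at archive.apache.org and is an assumption (older releases are removed from the main mirrors), so adjust it if needed:

# Fetch the published SHA-512 checksum for the release (URL is illustrative)
wget https://archive.apache.org/dist/hadoop/common/hadoop-3.1.2/hadoop-3.1.2.tar.gz.sha512
# Compute the local digest and compare it with the published one
sha512sum hadoop-3.1.2.tar.gz
cat hadoop-3.1.2.tar.gz.sha512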

3.2 Hadoop Configuration


Make a directory called hadoop under /usr/local from the hduser login and move the contents of the extracted 'hadoop-3.1.2' folder into it:

server1@server3: sudo mkdir -p /usr/local/hadoop


server1@server3: cd hadoop-3.1.2/
server1@server3: sudo mv * /usr/local/hadoop
server1@server3: sudo chown -R hduser:hadoop /usr/local/hadoop
STEP 4 – Setting up Configuration files
We will change the content of the following files in order to complete the Hadoop installation.
1. ~/.bashrc
2. hadoop-env.sh
3. core-site.xml
4. hdfs-site.xml
5. yarn-site.xml
Details:
• hadoop-env.sh – This file contains environment variable settings used by Hadoop. You can use these to affect some aspects of Hadoop daemon behavior, such as where log files are stored, the maximum amount of heap used, etc. The only variable you should need to change at this point is JAVA_HOME, which specifies the path to the Java installation used by Hadoop.
• core-site.xml – key property fs.default.name – for NameNode configuration, e.g. hdfs://namenode/. The NameNode is the node that stores the filesystem metadata, i.e. which file maps to what block locations and which blocks are stored on which DataNode.
• hdfs-site.xml – key property dfs.replication – 3 by default.
• mapred-site.xml – MapReduce configuration. The mapred.job.tracker property (e.g. jobtracker:8021) applied to the Hadoop 1.x JobTracker; under YARN the key property is mapreduce.framework.name. This file is not configured in this manual; a minimal sketch follows this list.
• yarn-site.xml – resource management.
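Since mapred-site.xml is left unconfigured in this manual, MapReduce jobs default to local mode. If you want jobs submitted to YARN instead, a minimal sketch of /usr/local/hadoop/etc/hadoop/mapred-site.xml would add the following between its configuration tags (on Hadoop 3.x you may additionally need to point yarn.app.mapreduce.am.env, mapreduce.map.env and mapreduce.reduce.env at HADOOP_MAPRED_HOME):

<property>
  <name>mapreduce.framework.name</name>
  <value>yarn</value>
</property>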
4.1 ~/.bashrc

If you don't know the path where Java is installed, first run the following command to locate it:
root@server3: readlink -f /usr/bin/java | sed "s:bin/java::"

Now open the ~/.bashrc file

hduser@server3:~$ sudo gedit ~/.bashrc

#HADOOP VARIABLES START

export JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64

export HADOOP_HOME=/usr/local/hadoop

export PATH=$PATH:$HADOOP_HOME/bin

export PATH=$PATH:$HADOOP_HOME/sbin

export HADOOP_MAPRED_HOME=$HADOOP_HOME

export HADOOP_COMMON_HOME=$HADOOP_HOME

export HADOOP_HDFS_HOME=$HADOOP_HOME

export YARN_HOME=$HADOOP_HOME

export HADOOP_COMMON_LIB_NATIVE_DIR=$HADOOP_HOME/lib/native

export HADOOP_OPTS="-Djava.library.path=$HADOOP_HOME/lib"

#HADOOP VARIABLES END


Reload the .bashrc file to apply the changes:
$ source ~/.bashrc
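To confirm the new variables took effect, a quick sanity check (not part of the original steps) is to ask the shell where it finds Hadoop:

# Should resolve to /usr/local/hadoop/bin/hadoop
which hadoop
# Should report version 3.1.2
hadoop version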

4.2 hadoop-env.sh
We need to tell Hadoop the path where Java is installed. That is what we do in this file: specify the path in the JAVA_HOME variable.
Open the file,
hduser@server3:~$ sudo gedit /usr/local/hadoop/etc/hadoop/hadoop-env.sh

Now, the first variable in the file will be JAVA_HOME; change its value to:
export JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64
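An optional quick check that the edit took effect:

# Print the JAVA_HOME line you just set
grep "^export JAVA_HOME" /usr/local/hadoop/etc/hadoop/hadoop-env.sh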

4.3 core-site.xml
Create a temporary directory:

hduser@server3 :~$ sudo mkdir -p /app/hadoop/tmp

hduser@server3 :~$ sudo chown hduser:hadoop /app/hadoop/tmp

Open the file:
hduser@server3 :~$ sudo gedit /usr/local/hadoop/etc/hadoop/core-site.xml

Append the following between the configuration tags, as shown below.

<property>
  <name>hadoop.tmp.dir</name>
  <value>/app/hadoop/tmp</value>
  <description>A base for other temporary directories.</description>
</property>

<property>
  <name>fs.default.name</name>
  <value>hdfs://localhost:54310</value>
  <description>The name of the default file system. A URI whose scheme and authority determine the FileSystem implementation. The URI's scheme determines the config property (fs.SCHEME.impl) naming the FileSystem implementation class. The URI's authority is used to determine the host, port, etc. for a filesystem.</description>
</property>
4.4 hdfs-site.xml
There are mainly two directories:
1. NameNode
2. DataNode
Make the directories:

hduser@server3: sudo mkdir -p /usr/local/hadoop_store/hdfs/namenode

hduser@server3: sudo mkdir -p /usr/local/hadoop_store/hdfs/datanode

hduser@server3: sudo chown -R hduser:hadoop /usr/local/hadoop_store

Open the file,

hduser@server3 sudo gedit /usr/local/hadoop/etc/hadoop/hdfs-site.xml


Change the content between the configuration tags as shown below.

<property>
  <name>dfs.replication</name>
  <value>1</value>
  <description>Default block replication. The actual number of replications can be specified when the file is created. The default is used if replication is not specified at create time.</description>
</property>

<property>
  <name>dfs.namenode.name.dir</name>
  <value>file:/usr/local/hadoop_store/hdfs/namenode</value>
</property>

<property>
  <name>dfs.datanode.data.dir</name>
  <value>file:/usr/local/hadoop_store/hdfs/datanode</value>
</property>
4.5 yarn-site.xml
Open the file,

hduser@server3 :~$ sudo gedit /usr/local/hadoop/etc/hadoop/yarn-site.xml


Just like the other files, add the following between the configuration tags.

<property>
  <name>yarn.nodemanager.aux-services</name>
  <value>mapreduce_shuffle</value>
</property>
STEP 5 – Format Hadoop file system
Hadoop installation is now done. All we have to do is format the NameNode before using HDFS for the first time. (Format only once: reformatting later erases the HDFS metadata. In Hadoop 3.x the preferred command is hdfs namenode -format; the older hadoop namenode -format still works but prints a deprecation warning.)

hduser@server3 :~$ hdfs namenode -format


STEP 6 – Start Hadoop daemons
Now that the Hadoop installation is complete and the NameNode is formatted, we can start Hadoop from the following directory. (In Hadoop 3.x, start-all.sh is deprecated; running start-dfs.sh followed by start-yarn.sh is the equivalent.)

$ cd /usr/local/hadoop/sbin

$ start-all.sh

Check whether all the daemons started properly using the following command:

$ jps
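If the daemons came up cleanly on this single-node setup, jps should list roughly the following (PIDs are illustrative and will differ):

Output
12321 NameNode
12506 DataNode
12707 SecondaryNameNode
12933 ResourceManager
13112 NodeManager
13421 Jps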

STEP 7 – Stop Hadoop daemons

Use this step when you need to stop Hadoop and all its modules.

$ stop-all.sh
Appreciate yourself, because you've done it: you have completed all the Hadoop installation steps, and Hadoop is now ready to run its first program.
Let's run a MapReduce job on our entirely fresh Hadoop cluster setup.
Go to the following directory

$ cd /usr/local/hadoop
Run the following command

hduser@server3 :/usr/local/hadoop$ hadoop jar ./share/hadoop/mapreduce/hadoop-mapreduce-examples-3.1.2.jar pi 10 100
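As a further smoke test, you could also run the bundled wordcount example against the Hadoop configuration files. The HDFS paths below are illustrative:

# Create an input directory in HDFS and upload some text files
hdfs dfs -mkdir -p /user/hduser/input
hdfs dfs -put /usr/local/hadoop/etc/hadoop/*.xml /user/hduser/input
# Run wordcount and inspect the result
hadoop jar ./share/hadoop/mapreduce/hadoop-mapreduce-examples-3.1.2.jar wordcount /user/hduser/input /user/hduser/output
hdfs dfs -cat /user/hduser/output/part-r-00000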
Note: To delete the dedicated Hadoop user later, run sudo userdel hduser (and sudo groupdel hadoop to remove the group).
