Hadoop Installation Steps

The document provides steps to install and configure Hadoop on Windows. It includes downloading the Hadoop binary, unpacking it, installing native libraries, configuring environment variables, and modifying configuration files. The key steps are: 1) Downloading the Hadoop binary package from the Apache website and saving it locally. 2) Unpacking the downloaded package using a tool like 7-Zip to extract files. 3) Downloading pre-built native libraries from GitHub to enable native support on Windows. 4) Configuring environment variables like JAVA_HOME and HADOOP_HOME to set paths for Java and Hadoop installations.

Uploaded by

Srinivasa Rao T

Step 1 - Download Hadoop binary package

Select download mirror link

Go to the download page of the official website:

Apache Download Mirrors - Hadoop 3.2.1

Then choose one of the mirror links. The page lists the mirrors closest to you
based on your location. I chose the following mirror:

http://apache.mirror.digitalpacific.com.au/hadoop/common/hadoop-3.2.1/hadoop-3.2.1.tar.gz

Download the package

I am installing Hadoop in the folder big-data on my F drive (F:\big-data). If you prefer to
install on another drive, remember to change the path accordingly in the following
commands. This directory is called the destination directory in the following sections.

Open PowerShell and run the following commands one by one:

$dest_dir="F:\big-data"
$url = "http://apache.mirror.digitalpacific.com.au/hadoop/common/hadoop-3.2.1/hadoop-3.2.1.tar.gz"
$client = new-object System.Net.WebClient
$client.DownloadFile($url,$dest_dir+"\hadoop-3.2.1.tar.gz")

The download may take a few minutes. Once it completes, you can verify it:

PS F:\big-data> cd $dest_dir
PS F:\big-data> ls

    Directory: F:\big-data

Mode                LastWriteTime         Length Name
----                -------------         ------ ----
-a----       18/01/2020  11:01 AM      359196911 hadoop-3.2.1.tar.gz

PS F:\big-data>
You can also directly download the package through your web browser and save it to
the destination directory.
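
Before unpacking, it can be worth checking the archive's integrity. Apache publishes a SHA-512 checksum file next to each release archive; the Python sketch below is not part of the original steps and simply computes the digest of the downloaded file so it can be compared with the published value. The function names and the chunk size are illustrative assumptions, and the published digest has to be copied from the Apache site by hand.

```python
import hashlib
from pathlib import Path


def sha512_of_file(path, chunk_size=1024 * 1024):
    """Return the hex SHA-512 digest of a file, read in chunks to limit memory use."""
    digest = hashlib.sha512()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_digest(path, published_hex):
    """Compare case-insensitively, ignoring whitespace in the published value."""
    cleaned = "".join(published_hex.split()).lower()
    return sha512_of_file(path) == cleaned


if __name__ == "__main__":
    # Destination directory used in this guide; adjust to your own path.
    archive = Path(r"F:\big-data") / "hadoop-3.2.1.tar.gz"
    if archive.exists():
        print(sha512_of_file(archive))
```

If the printed digest differs from the published one, download the package again before continuing.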

Step 2 - Unpack the package

Now we need to unpack the downloaded package using a GUI tool (like 7-Zip) or the
command line. I will use Git Bash to unpack it.

Open Git Bash and change the directory to the destination folder:

cd F:/big-data

Then run the following command to unpack the archive:

tar -xvzf hadoop-3.2.1.tar.gz

The command may take a few minutes, as the package contains numerous files and
the latest version introduced many new features.

After the command completes, a new folder hadoop-3.2.1 is created under
the destination folder.

When running the command you will see errors like the following:

tar: hadoop-3.2.1/lib/native/libhadoop.so: Cannot create symlink to ‘libhadoop.so.1.0.0’: No such file or directory

Please ignore them for now, as those native libraries are for Linux/UNIX and we will
install Windows native IO libraries in the following steps.
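
If Git Bash is not available, the archive can also be unpacked from a script. The following Python sketch uses the standard tarfile module and skips symbolic and hard links, which are the entries that trigger the errors shown above; this is an alternative to the tar command, not what the original steps use, and the function name is an illustrative assumption.

```python
import tarfile
from pathlib import Path


def extract_without_links(archive_path, dest_dir):
    """Extract regular files and directories, skipping symlinks and hard links.

    Returns the names of the skipped link members.
    """
    skipped = []
    with tarfile.open(archive_path, "r:gz") as archive:
        members = []
        for member in archive.getmembers():
            if member.issym() or member.islnk():
                skipped.append(member.name)
            else:
                members.append(member)
        archive.extractall(path=dest_dir, members=members)
    return skipped


if __name__ == "__main__":
    # Destination directory used in this guide; adjust to your own path.
    archive = Path("F:/big-data/hadoop-3.2.1.tar.gz")
    if archive.exists():
        for name in extract_without_links(archive, archive.parent):
            print("skipped link:", name)
```

The skipped entries are the Linux/UNIX native library links, which are not needed on Windows.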

Step 3 - Install Hadoop native IO binary

Hadoop on Linux includes optional Native IO support. However, Native IO is
mandatory on Windows, and without it you will not be able to get your installation
working. The Windows native IO libraries are not included in the Apache Hadoop
release, so we need to build and install them.

I also published another article with very detailed steps about how to compile and
build native Hadoop on Windows: Compile and Build Hadoop 3.2.1 on Windows 10
Guide.

The build may take about one hour, so to save time we can just download the
binary package from GitHub.

Info: The following repository provides pre-built Hadoop Windows native libraries:
https://github.com/cdarlint/winutils

Warning: These libraries are not signed and there is no guarantee that they are
100% safe. We use them purely for testing and learning purposes.

Download all the files in the following location and save them to the bin folder under
the Hadoop folder. For my environment, the full path is F:\big-data\hadoop-3.2.1\bin.
Remember to change it to your own path accordingly.

https://github.com/cdarlint/winutils/tree/master/hadoop-3.2.1/bin

Alternatively, you can run the following commands in the previous PowerShell
window to download them:

$client.DownloadFile("https://github.com/cdarlint/winutils/raw/master/hadoop-3.2.1/bin/hadoop.dll",$dest_dir+"\hadoop-3.2.1\bin\"+"hadoop.dll")
$client.DownloadFile("https://github.com/cdarlint/winutils/raw/master/hadoop-3.2.1/bin/hadoop.exp",$dest_dir+"\hadoop-3.2.1\bin\"+"hadoop.exp")
$client.DownloadFile("https://github.com/cdarlint/winutils/raw/master/hadoop-3.2.1/bin/hadoop.lib",$dest_dir+"\hadoop-3.2.1\bin\"+"hadoop.lib")
$client.DownloadFile("https://github.com/cdarlint/winutils/raw/master/hadoop-3.2.1/bin/hadoop.pdb",$dest_dir+"\hadoop-3.2.1\bin\"+"hadoop.pdb")
$client.DownloadFile("https://github.com/cdarlint/winutils/raw/master/hadoop-3.2.1/bin/libwinutils.lib",$dest_dir+"\hadoop-3.2.1\bin\"+"libwinutils.lib")
$client.DownloadFile("https://github.com/cdarlint/winutils/raw/master/hadoop-3.2.1/bin/winutils.exe",$dest_dir+"\hadoop-3.2.1\bin\"+"winutils.exe")
$client.DownloadFile("https://github.com/cdarlint/winutils/raw/master/hadoop-3.2.1/bin/winutils.pdb",$dest_dir+"\hadoop-3.2.1\bin\"+"winutils.pdb")

After this, the bin folder contains the downloaded files (hadoop.dll, hadoop.exp,
hadoop.lib, hadoop.pdb, libwinutils.lib, winutils.exe and winutils.pdb) alongside the
scripts shipped with Hadoop.

Step 4 - (Optional) Java JDK installation

A Java JDK is required to run Hadoop. If you have not installed one yet, please
install it.

You can install JDK 8 from the following page:

https://www.oracle.com/technetwork/java/javase/downloads/jdk8-downloads-2133151.html

Once the installation completes, run the following command in PowerShell
or Git Bash to verify it:

$ java -version
java version "1.8.0_161"
Java(TM) SE Runtime Environment (build 1.8.0_161-b12)
Java HotSpot(TM) 64-Bit Server VM (build 25.161-b12, mixed mode)

If you get an error such as 'cannot find java command or executable', don't worry;
we will resolve it in the following step.

Step 5 - Configure environment variables

Now that we've downloaded and unpacked all the artefacts, we need to configure two
important environment variables.

Configure JAVA_HOME environment variable

As mentioned earlier, Hadoop requires Java, so we need to configure the JAVA_HOME
environment variable (it is not mandatory, but I recommend it).

First, find the location of the JDK. On my system, the path is D:\Java\jdk1.8.0_161.
Your location may differ depending on where you installed the JDK.

Then run the following command in the previous PowerShell window:

SETX JAVA_HOME "D:\Java\jdk1.8.0_161"

Remember to quote the path, especially if it contains spaces.

You can set the environment variable at system level by adding the option /M. If you
don't have permission to change system variables, setting it at user level is enough.

Configure HADOOP_HOME environment variable

Similarly, we need to create a new environment variable HADOOP_HOME using
the following command. The path should be your extracted Hadoop folder. For my
environment it is F:\big-data\hadoop-3.2.1.

If you used PowerShell to download and the window is still open, you can simply
run the following command. The path is written as one quoted string because
PowerShell does not evaluate + when passing arguments to an external program
such as SETX:

SETX HADOOP_HOME "$dest_dir\hadoop-3.2.1"

Alternatively, you can specify the full path:

SETX HADOOP_HOME "F:\big-data\hadoop-3.2.1"

You can now verify the two environment variables in the system settings.

Configure PATH environment variable

Once the two environment variables above are set, we need to add
their bin folders to the PATH environment variable.

If a PATH environment variable already exists in your system, you can manually add
the following two paths to it:

• %JAVA_HOME%/bin
• %HADOOP_HOME%/bin

Alternatively, you can run the following command to add them. Run it in a new
PowerShell window, because variables created with SETX are only visible in
sessions started afterwards:

setx PATH "$env:PATH;$env:JAVA_HOME/bin;$env:HADOOP_HOME/bin"

If you don't have other user variables set up in the system, you can also directly add
a Path environment variable that references the others to keep it short.

Close the PowerShell window, open a new one and type winutils.exe to verify that
the steps above completed successfully.

You should also be able to run the following command:

hadoop -version
java version "1.8.0_161"
Java(TM) SE Runtime Environment (build 1.8.0_161-b12)
Java HotSpot(TM) 64-Bit Server VM (build 25.161-b12, mixed mode)
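
The manual checks above can also be scripted. The sketch below, in Python, only verifies that JAVA_HOME and HADOOP_HOME are set and that java.exe and winutils.exe exist in their bin folders; it does not inspect PATH. The function name and the choice of files to look for are assumptions made for illustration.

```python
import os
from pathlib import Path


def check_hadoop_environment(env):
    """Return a list of human-readable problems; an empty list means the checks passed."""
    problems = []
    # Each home variable is paired with one executable expected in its bin folder.
    for variable, executable in (("JAVA_HOME", "java.exe"), ("HADOOP_HOME", "winutils.exe")):
        home = env.get(variable)
        if not home:
            problems.append(f"{variable} is not set")
            continue
        binary = Path(home) / "bin" / executable
        if not binary.is_file():
            problems.append(f"{binary} not found")
    return problems


if __name__ == "__main__":
    for problem in check_hadoop_environment(os.environ):
        print(problem)
```

Run it from a newly opened terminal, since variables set with SETX are not visible in windows that were already open.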
Step 6 - Configure Hadoop

Now we are ready to configure the most important part: the Hadoop configuration,
which covers Core, YARN, MapReduce and HDFS.

Configure core site

Edit the file core-site.xml in the %HADOOP_HOME%\etc\hadoop folder. For my
environment, the actual path is F:\big-data\hadoop-3.2.1\etc\hadoop.

Replace the configuration element with the following (fs.default.name is a
deprecated alias of fs.defaultFS that Hadoop 3 still accepts):

<configuration>
   <property>
     <name>fs.default.name</name>
     <value>hdfs://0.0.0.0:19000</value>
   </property>
</configuration>
Configure HDFS

Edit the file hdfs-site.xml in the %HADOOP_HOME%\etc\hadoop folder.

Before editing, create two folders in your system: one for the namenode
directory and another for the data directory. On my system, I created the following two
sub folders:

• F:\big-data\data\dfs\namespace_logs
• F:\big-data\data\dfs\data

Replace the configuration element with the following (remember to replace
the paths accordingly):

<configuration>
   <property>
     <name>dfs.replication</name>
     <value>1</value>
   </property>
   <property>
     <name>dfs.namenode.name.dir</name>
     <value>file:///F:/big-data/data/dfs/namespace_logs</value>
   </property>
   <property>
     <name>dfs.datanode.data.dir</name>
     <value>file:///F:/big-data/data/dfs/data</value>
   </property>
</configuration>
In Hadoop 3, the property names are slightly different from previous versions. Refer to
the following official documentation to learn more about the configuration properties:

Hadoop 3.2.1 hdfs_default.xml

DFS replication is set to 1 because we are configuring a single node. By default
the value is 3.

The directory settings are not mandatory; by default Hadoop uses its temporary
folder. For this tutorial, I recommend customising the values.
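
When the folders differ from mine, the same configuration can be generated instead of edited by hand. The Python sketch below builds a configuration element from a name-to-value mapping and derives the file:/// values from Windows paths; the helper names are illustrative assumptions, and the output is a single unindented line that still has to be saved as hdfs-site.xml.

```python
import xml.etree.ElementTree as ET


def render_site_xml(properties):
    """Build a Hadoop <configuration> document from a name -> value mapping."""
    root = ET.Element("configuration")
    for name, value in properties.items():
        prop = ET.SubElement(root, "property")
        ET.SubElement(prop, "name").text = name
        ET.SubElement(prop, "value").text = str(value)
    return ET.tostring(root, encoding="unicode")


def windows_dir_uri(path):
    """Turn an absolute Windows path into the file:///F:/... form used above.

    No percent-encoding is applied, so paths with spaces need extra care.
    """
    return "file:///" + str(path).replace("\\", "/")


hdfs_site = render_site_xml({
    "dfs.replication": 1,
    "dfs.namenode.name.dir": windows_dir_uri(r"F:\big-data\data\dfs\namespace_logs"),
    "dfs.datanode.data.dir": windows_dir_uri(r"F:\big-data\data\dfs\data"),
})
print(hdfs_site)
```

The same render_site_xml helper would work for the other site files, since they share the property/name/value layout.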
Configure MapReduce and YARN site

Edit file mapred-site.xml in %HADOOP_HOME%\etc\hadoop folder.

Replace configuration element with the following:

<configuration>
<property>
<name>mapreduce.framework.name</name>
<value>yarn</value>
</property>
<property>
<name>mapreduce.application.classpath</name>
<value>%HADOOP_HOME%/share/hadoop/mapreduce/*,%HADOOP_HOME%/share/hadoop/mapreduce/lib/*,%HADOOP_HOME%/share/hadoop/common/*,%HADOOP_HOME%/share/hadoop/common/lib/*,%HADOOP_HOME%/share/hadoop/yarn/*,%HADOOP_HOME%/share/hadoop/yarn/lib/*,%HADOOP_HOME%/share/hadoop/hdfs/*,%HADOOP_HOME%/share/hadoop/hdfs/lib/*</value>
</property>
</configuration>
Edit the file yarn-site.xml in the %HADOOP_HOME%\etc\hadoop folder and replace the configuration element with the following:

<configuration>
<property>
<name>yarn.nodemanager.aux-services</name>
<value>mapreduce_shuffle</value>
</property>
<property>
<name>yarn.nodemanager.env-whitelist</name>

<value>JAVA_HOME,HADOOP_COMMON_HOME,HADOOP_HDFS_HOME,HADOOP_CONF_DIR,CLASSPATH_PREPEND_DISTCACHE,HADOOP_YARN_HOME,HADOOP_MAPRED_HOME</value>
</property>
</configuration>
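
After editing the four files, a quick read-back can catch typos. The Python sketch below parses the text of a site file into a dictionary and lists properties whose values contain spaces or line breaks, which can happen when long values such as the classpath are pasted from a wrapped page. The function names are illustrative assumptions, and a flagged value is only a hint to review it, not proof of an error.

```python
import xml.etree.ElementTree as ET


def read_site_properties(xml_text):
    """Parse a Hadoop *-site.xml document and return {name: value}."""
    root = ET.fromstring(xml_text)
    if root.tag != "configuration":
        raise ValueError("root element must be <configuration>")
    properties = {}
    for prop in root.findall("property"):
        name = (prop.findtext("name") or "").strip()
        value = prop.findtext("value") or ""
        properties[name] = value
    return properties


def values_with_whitespace(properties):
    """Names whose values contain spaces or line breaks, worth a second look."""
    return sorted(
        name for name, value in properties.items()
        if any(ch.isspace() for ch in value)
    )


sample = (
    "<configuration><property>"
    "<name>yarn.nodemanager.aux-services</name>"
    "<value>mapreduce_shuffle</value>"
    "</property></configuration>"
)
print(read_site_properties(sample))
```

To check a real file, pass its contents, for example the text read from yarn-site.xml in the etc\hadoop folder.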
Step 7 - Initialise HDFS & bug fix

Run the following command in Command Prompt:

hdfs namenode -format

On 3.2.1 this command fails with the following error, which we need to fix:

2020-01-18 13:36:03,021 ERROR namenode.NameNode: Failed to start namenode.
java.lang.UnsupportedOperationException
        at java.nio.file.Files.setPosixFilePermissions(Files.java:2044)
        at org.apache.hadoop.hdfs.server.common.Storage$StorageDirectory.clearDirectory(Storage.java:452)
        at org.apache.hadoop.hdfs.server.namenode.NNStorage.format(NNStorage.java:591)
        at org.apache.hadoop.hdfs.server.namenode.NNStorage.format(NNStorage.java:613)
        at org.apache.hadoop.hdfs.server.namenode.FSImage.format(FSImage.java:188)
        at org.apache.hadoop.hdfs.server.namenode.NameNode.format(NameNode.java:1206)
        at org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:1649)
        at org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:1759)
2020-01-18 13:36:03,025 INFO util.ExitUtil: Exiting with status 1: java.lang.UnsupportedOperationException

Refer to the following sub section (About 3.2.1 HDFS bug on Windows) for
details on fixing this problem.

Once this is fixed, the format command (hdfs namenode -format) completes
without the error above.

About 3.2.1 HDFS bug on Windows

This is a bug in the 3.2.1 release:

https://issues.apache.org/jira/browse/HDFS-14890

It will be resolved in versions 3.2.2 and 3.3.0.

We can apply a temporary fix, as the following code change shows:

Code fix for HDFS-14890

I've done the following to get this temporarily fixed before 3.2.2/3.3.0 is released:

• Check out the source code of the Hadoop project from GitHub.
• Check out branch 3.2.1.
• Open the pom file of the hadoop-hdfs project.
• Update class StorageDirectory as follows:

if (permission != null) {
  try {
    Set<PosixFilePermission> permissions =
        PosixFilePermissions.fromString(permission.toString());
    Files.setPosixFilePermissions(curDir.toPath(), permissions);
  } catch (UnsupportedOperationException uoe) {
    // Default to FileUtil for non posix file systems
    FileUtil.setPermission(curDir, permission);
  }
}

• Use Maven to rebuild this project.

Fix bug HDFS-14890

I've uploaded the JAR file to the following location. Please download it from the
following link:

https://github.com/FahaoTang/big-data/blob/master/hadoop-hdfs-3.2.1.jar

Then rename the existing file hadoop-hdfs-3.2.1.jar to hadoop-hdfs-3.2.1.bk in the
folder %HADOOP_HOME%\share\hadoop\hdfs.

Copy the downloaded hadoop-hdfs-3.2.1.jar to the folder
%HADOOP_HOME%\share\hadoop\hdfs.
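
The rename-and-copy step can be scripted so that the original JAR is never lost. The Python sketch below keeps the shipped JAR as a .bk file and copies the patched JAR into place; the function name and its arguments are illustrative assumptions, and the patched JAR is expected to have been downloaded already.

```python
import shutil
from pathlib import Path


def replace_jar_with_backup(patched_jar, hdfs_dir, jar_name="hadoop-hdfs-3.2.1.jar"):
    """Rename the shipped JAR to *.bk (only once) and copy the patched JAR into place."""
    target = Path(hdfs_dir) / jar_name
    backup = target.with_suffix(".bk")  # hadoop-hdfs-3.2.1.jar -> hadoop-hdfs-3.2.1.bk
    if target.exists() and not backup.exists():
        target.rename(backup)
    shutil.copyfile(patched_jar, target)
    return backup
```

Calling it a second time leaves the existing backup untouched, so the original JAR can always be restored by renaming the .bk file back.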

This is just a temporary fix until the official fix is released. I publish it
purely so that we can complete the whole installation process, and there is no
guarantee that this temporary fix won't cause new issues.

Refer to this article for more details about how to build native Hadoop on
Windows: Compile and Build Hadoop 3.2.1 on Windows 10 Guide.

Step 8 - Start HDFS daemons

Run the following command in Command Prompt to start the HDFS daemons:

%HADOOP_HOME%\sbin\start-dfs.cmd

Two Command Prompt windows will open: one for the datanode and another for
the namenode.

Step 9 - Start YARN daemons

You may encounter permission issues if you start the YARN daemons as a normal
user. To avoid them, open a Command Prompt window using Run as administrator.

Alternatively, you can follow this comment, which uses a local Windows account and
doesn't require Administrator permission:

https://kontext.tech/column/hadoop/377/latest-hadoop-321-installation-on-windows-10-step-by-step-guide#comment314

Run the following command in an elevated Command Prompt window (Run as
administrator) to start the YARN daemons:

%HADOOP_HOME%\sbin\start-yarn.cmd

Similarly, two Command Prompt windows will open: one for the resource manager and
another for the node manager.

Step 10 - Useful Web portals exploration

The daemons also host websites that provide useful information about the cluster.

HDFS Namenode information UI

http://localhost:9870/dfshealth.html#tab-overview

HDFS Datanode information UI

http://localhost:9864/datanode.html

YARN resource manager UI

http://localhost:8088
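
If a page does not load, it helps to know whether the daemon is listening at all. The Python sketch below tries a plain TCP connection to each of the three ports listed above; the function name and the timeout are assumptions, and a successful connection only shows that something is listening on the port, not that the UI is healthy.

```python
import socket


def port_is_open(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


# Ports taken from the URLs in this step.
WEB_UIS = {
    "HDFS NameNode": 9870,
    "HDFS DataNode": 9864,
    "YARN ResourceManager": 8088,
}

if __name__ == "__main__":
    for name, port in WEB_UIS.items():
        state = "reachable" if port_is_open("localhost", port) else "not reachable"
        print(f"{name} UI on port {port}: {state}")
```

A port reported as not reachable usually means the corresponding Command Prompt window from Step 8 or Step 9 has closed or shows an error.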

Step 11 - Shutdown YARN & HDFS daemons

You don't need to keep the services running all the time. You can stop them by
running the following commands one by one:

%HADOOP_HOME%\sbin\stop-yarn.cmd
%HADOOP_HOME%\sbin\stop-dfs.cmd

Congratulations! You've successfully completed the installation of Hadoop 3.2.1 on
Windows 10.

Let me know if you encounter any issues. Enjoy your new Hadoop installation on
Windows 10.
