
AMC ENGINEERING COLLEGE

Dept. Of Computer Science and Engineering


Big Data Analytics [21CS71] Assignment-2
Topic: Installation of Hadoop
Kalyan G V (1AM21CS077)
Installation of Hadoop

This document outlines the steps for installing Hadoop on a single Linux node: installing the prerequisites (Java and SSH), downloading Hadoop, configuring environment variables and the core configuration files, formatting HDFS, starting the services, and verifying the installation through the web UIs and a sample job. The instructions assume familiarity with the Linux command line.

Steps for Hadoop Installation:

1. Install Java Development Kit (JDK):

Hadoop requires Java to be installed on your system.

• To check if Java is installed:

java -version

• If Java is not installed, install it using:

sudo apt update

sudo apt install openjdk-8-jdk

• Verify the installation:

java -version

2. Install SSH:

Hadoop's control scripts use SSH to start and stop the daemons on each node, so SSH must be installed even on a single machine.

• Install SSH if it is not already present:

sudo apt install openssh-server

• Ensure SSH is running:

sudo systemctl start ssh

sudo systemctl enable ssh
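
• For a single-node setup, the start-dfs.sh and start-yarn.sh scripts log in to localhost over SSH, so passwordless login must work. A standard key-based setup (skip it if ssh localhost already connects without a password):

ssh-keygen -t rsa -P '' -f ~/.ssh/id_rsa

cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys

chmod 0600 ~/.ssh/authorized_keys

ssh localhost
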
3. Download Hadoop:

• Download Hadoop from the official Apache website (downloads.apache.org hosts only recent releases; if 3.3.1 has been archived, fetch it from https://archive.apache.org/dist/hadoop/common/ instead):

wget https://downloads.apache.org/hadoop/common/hadoop-3.3.1/hadoop-3.3.1.tar.gz
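
• Optionally, verify the integrity of the download; Apache publishes a .sha512 checksum file alongside each release, and the digest printed by the command below should match its contents:

sha512sum hadoop-3.3.1.tar.gz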

• Extract the downloaded tar file:

tar -xzvf hadoop-3.3.1.tar.gz

• Move the extracted folder to /usr/local/hadoop:

sudo mv hadoop-3.3.1 /usr/local/hadoop

4. Configure Hadoop Environment Variables:

• Open the .bashrc file to add Hadoop-related environment variables:

nano ~/.bashrc

• Add the following lines at the end of the file:

export HADOOP_HOME=/usr/local/hadoop
export PATH=$PATH:$HADOOP_HOME/bin:$HADOOP_HOME/sbin
export HADOOP_CONF_DIR=$HADOOP_HOME/etc/hadoop
export YARN_HOME=$HADOOP_HOME

• Save and close the file. Then, apply the changes:

source ~/.bashrc
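
• To confirm the variables are active in the current shell:

echo $HADOOP_HOME

This should print /usr/local/hadoop. Once JAVA_HOME is set in the next step (hadoop-env.sh), hadoop version should also run and report version 3.3.1.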

5. Configure Hadoop Files:

Hadoop requires several configuration files to be set up for proper functioning.
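
• hadoop-env.sh: daemons launched over SSH do not inherit your shell environment, so set JAVA_HOME explicitly in this file:

nano $HADOOP_HOME/etc/hadoop/hadoop-env.sh

Add (or uncomment and edit) the following line; the path shown assumes the openjdk-8-jdk package on amd64 Ubuntu, and readlink -f /usr/bin/java will confirm the actual location:

export JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64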

• core-site.xml: Navigate to $HADOOP_HOME/etc/hadoop and open core-site.xml:

nano $HADOOP_HOME/etc/hadoop/core-site.xml

Add the following configuration:

<configuration>
<property>
<name>fs.defaultFS</name>
<value>hdfs://localhost:9000</value>
</property>
</configuration>
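
Optionally, also set hadoop.tmp.dir so that HDFS data does not land under /tmp, which is cleared on reboot. This assumes you create a directory such as /usr/local/hadoop/tmp and make it writable by your user:

<property>
<name>hadoop.tmp.dir</name>
<value>/usr/local/hadoop/tmp</value>
</property>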

• hdfs-site.xml: Edit hdfs-site.xml:

nano $HADOOP_HOME/etc/hadoop/hdfs-site.xml

Add the following configuration:

<configuration>
<property>
<name>dfs.replication</name>
<value>1</value>
</property>
</configuration>
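
Optionally, pin the NameNode and DataNode storage directories instead of relying on the defaults under hadoop.tmp.dir; the paths below are illustrative and must exist and be writable by your user:

<property>
<name>dfs.namenode.name.dir</name>
<value>file:///usr/local/hadoop/hdfs/namenode</value>
</property>
<property>
<name>dfs.datanode.data.dir</name>
<value>file:///usr/local/hadoop/hdfs/datanode</value>
</property>
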
• mapred-site.xml: Edit mapred-site.xml (in Hadoop 3.x this file ships in etc/hadoop; only older 2.x releases require copying it from mapred-site.xml.template first):

nano $HADOOP_HOME/etc/hadoop/mapred-site.xml

Add the following configuration:

<configuration>
<property>
<name>mapreduce.framework.name</name>
<value>yarn</value>
</property>
</configuration>
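
On Hadoop 3.x, MapReduce jobs submitted to YARN also need the MapReduce libraries on their classpath; the official single-node setup guide adds a property for this. A sketch following that guide:

<property>
<name>mapreduce.application.classpath</name>
<value>$HADOOP_MAPRED_HOME/share/hadoop/mapreduce/*:$HADOOP_MAPRED_HOME/share/hadoop/mapreduce/lib/*</value>
</property>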

• yarn-site.xml: Edit yarn-site.xml:

nano $HADOOP_HOME/etc/hadoop/yarn-site.xml

Add the following configuration:

<configuration>
<property>
<name>yarn.resourcemanager.address</name>
<value>localhost:8032</value>
</property>
</configuration>
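
For MapReduce jobs to shuffle intermediate data on YARN, the NodeManager must also run the shuffle auxiliary service; this is the standard setting from the Hadoop docs:

<property>
<name>yarn.nodemanager.aux-services</name>
<value>mapreduce_shuffle</value>
</property>
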
6. Format the Hadoop File System:

• Format the Hadoop Distributed File System (HDFS) before starting it for the first time (run this only once; reformatting later erases the NameNode's metadata and with it the references to any existing HDFS data):

hdfs namenode -format

7. Start Hadoop Services:

• Start the HDFS daemons (Namenode, Datanode):

start-dfs.sh

• Start YARN daemons (ResourceManager, NodeManager):

start-yarn.sh
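
• Confirm the daemons are running with the JDK's jps tool; on a single node you should see NameNode, DataNode, SecondaryNameNode, ResourceManager, and NodeManager listed:

jps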
8. Verify Hadoop Installation:

• Check if Hadoop is running by opening the ResourceManager and NameNode web UIs:
o ResourceManager UI: http://localhost:8088
o NameNode UI: http://localhost:9870
• Check the status of HDFS:

hdfs dfsadmin -report

• You can also run a sample MapReduce job; the pi example below estimates Pi with 2 map tasks of 5 samples each, and a successful run ends by printing the estimated value:

yarn jar $HADOOP_HOME/share/hadoop/mapreduce/hadoop-mapreduce-examples-3.3.1.jar pi 2 5
