
Hadoop Installation

Environment required for Hadoop: The production environment for Hadoop is UNIX, but it can also be used on Windows through Cygwin. Java 1.6 or above is needed to run MapReduce programs. For a Hadoop installation from a tarball on a UNIX environment you need:

1. Java Installation
2. SSH installation
3. Hadoop Installation and File Configuration

1) Java Installation
Step 1. Type "java -version" at the prompt to check whether Java is already installed. If not, download the JDK from
http://www.oracle.com/technetwork/java/javase/downloads/jdk7-downloads-1880260.html. The tar file jdk-7u71-linux-x64.tar.gz will be downloaded to your
system.

Step 2. Extract the file using the command below:

# tar zxf jdk-7u71-linux-x64.tar.gz

Step 3. To make Java available to all users, move the extracted directory to /usr/lib and set the path. At the prompt, switch to the root user and then type the command below to move the JDK to /usr/lib:

# mv jdk1.7.0_71 /usr/lib/

Now add the following lines to the ~/.bashrc file to set up the path:

export JAVA_HOME=/usr/lib/jdk1.7.0_71
export PATH=$PATH:$JAVA_HOME/bin

Now you can check the installation by typing "java -version" at the prompt.
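Reload the shell configuration first so the new PATH takes effect. The version output below is illustrative; the exact build strings will match whichever JDK you installed:

$ source ~/.bashrc
$ java -version
java version "1.7.0_71"
Java(TM) SE Runtime Environment (build 1.7.0_71-b14)
Java HotSpot(TM) 64-Bit Server VM (build 24.71-b01, mixed mode)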

2) SSH Installation
SSH is used to interact with the master and slave machines without a password prompt. First, create a hadoop user on the master and slave systems:

# useradd hadoop
# passwd hadoop

To map the nodes, open the hosts file in the /etc/ folder on all machines and add each node's IP address along with its host name.

# vi /etc/hosts

Enter the lines below

190.12.1.114 hadoop-master
190.12.1.121 hadoop-slave-one
190.12.1.143 hadoop-slave-two
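To confirm that name resolution works, you can ping each node by host name (the IP addresses above are examples; substitute your own):

$ ping -c 1 hadoop-master
$ ping -c 1 hadoop-slave-one
$ ping -c 1 hadoop-slave-two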

Set up an SSH key on every node so that the nodes can communicate with one another without a password. The commands are:

# su hadoop
$ ssh-keygen -t rsa
$ ssh-copy-id -i ~/.ssh/id_rsa.pub hadoop@hadoop-master
$ ssh-copy-id -i ~/.ssh/id_rsa.pub hadoop@hadoop-slave-one
$ ssh-copy-id -i ~/.ssh/id_rsa.pub hadoop@hadoop-slave-two
$ chmod 0600 ~/.ssh/authorized_keys
$ exit
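You can verify the passwordless setup by logging in to each node as the hadoop user; no password prompt should appear. Repeat for each slave:

$ ssh hadoop@hadoop-slave-one
$ exit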

3) Hadoop Installation
Hadoop can be downloaded from the Apache Hadoop releases page at https://hadoop.apache.org/releases.html

Now extract Hadoop and copy it to the installation location:

$ sudo mkdir /usr/hadoop
$ sudo tar xvzf hadoop-2.2.0.tar.gz -C /usr/hadoop

Change the ownership of the Hadoop folder:

$ sudo chown -R hadoop /usr/hadoop

Change the Hadoop configuration files. All the files are in /usr/hadoop/etc/hadoop.


1) In the hadoop-env.sh file add

export JAVA_HOME=/usr/lib/jdk1.7.0_71

2) In core-site.xml add the following between the configuration tags:

<configuration>
  <property>
    <name>fs.default.name</name>
    <value>hdfs://hadoop-master:9000</value>
  </property>
  <property>
    <name>dfs.permissions</name>
    <value>false</value>
  </property>
</configuration>
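Note that in Hadoop 2.x the fs.default.name key is deprecated in favour of fs.defaultFS; both are accepted, and the equivalent modern form would be:

<property>
  <name>fs.defaultFS</name>
  <value>hdfs://hadoop-master:9000</value>
</property>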

3) In hdfs-site.xml add the following between the configuration tags:

<configuration>
  <property>
    <name>dfs.data.dir</name>
    <value>/usr/hadoop/dfs/name/data</value>
    <final>true</final>
  </property>
  <property>
    <name>dfs.name.dir</name>
    <value>/usr/hadoop/dfs/name</value>
    <final>true</final>
  </property>
  <property>
    <name>dfs.replication</name>
    <value>1</value>
  </property>
</configuration>
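The directories named in dfs.name.dir and dfs.data.dir must exist and be writable by the hadoop user before the name node is formatted. A minimal sketch, assuming the paths above:

$ sudo mkdir -p /usr/hadoop/dfs/name/data
$ sudo chown -R hadoop /usr/hadoop/dfs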

4) Open mapred-site.xml and make the change shown below:

<configuration>
  <property>
    <name>mapred.job.tracker</name>
    <value>hadoop-master:9001</value>
  </property>
</configuration>
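Note that mapred.job.tracker is the classic MRv1 (JobTracker) setting. Hadoop 2.2.0 normally runs MapReduce on YARN instead; if you take that route, the usual equivalents would be mapreduce.framework.name in mapred-site.xml and the shuffle service in yarn-site.xml, roughly as follows (a sketch, not part of the original walkthrough):

<!-- mapred-site.xml -->
<property>
  <name>mapreduce.framework.name</name>
  <value>yarn</value>
</property>

<!-- yarn-site.xml -->
<property>
  <name>yarn.nodemanager.aux-services</name>
  <value>mapreduce_shuffle</value>
</property>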

5) Finally, update your $HOME/.bashrc

$ cd $HOME
$ vi .bashrc

Append the following lines at the end, then save and exit:

# Hadoop variables
export JAVA_HOME=/usr/lib/jdk1.7.0_71
export HADOOP_INSTALL=/usr/hadoop
export PATH=$PATH:$HADOOP_INSTALL/bin
export PATH=$PATH:$HADOOP_INSTALL/sbin
export HADOOP_MAPRED_HOME=$HADOOP_INSTALL
export HADOOP_COMMON_HOME=$HADOOP_INSTALL
export HADOOP_HDFS_HOME=$HADOOP_INSTALL
export YARN_HOME=$HADOOP_INSTALL
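Then apply the new variables to the current shell:

$ source ~/.bashrc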

Copy the Hadoop installation from the master to the slave machines using the commands below:

# su hadoop
$ cd /usr
$ scp -r hadoop hadoop-slave-one:/usr
$ scp -r hadoop hadoop-slave-two:/usr

Configure the master and slave nodes:

$ vi etc/hadoop/masters
hadoop-master

$ vi etc/hadoop/slaves
hadoop-slave-one
hadoop-slave-two

After this, format the name node and start all the daemons:

# su hadoop
$ cd /usr/hadoop
$ bin/hadoop namenode -format

$ cd $HADOOP_INSTALL/sbin
$ start-all.sh
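If everything started correctly, running jps on the master should list the HDFS and YARN daemons, along these lines (the process IDs are illustrative; the slaves should show DataNode and NodeManager instead):

$ jps
4521 NameNode
4702 SecondaryNameNode
4890 ResourceManager
5013 Jps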
