Hadoop Installation
Setting up Hadoop involves three stages:
1. Java Installation
2. SSH Installation
3. Hadoop Installation and File Configuration
1) Java Installation
Step 1. Type "java -version" at the prompt to check whether Java is already installed. If it is not, download the JDK from http://www.oracle.com/technetwork/java/javase/downloads/jdk7-downloads-1880260.html. The tar file jdk-7u71-linux-x64.tar.gz will be downloaded to your system.
Step 2. Extract the downloaded archive, which creates a jdk1.7.0_71 directory:
$ tar -zxvf jdk-7u71-linux-x64.tar.gz
Step 3. To make Java available to all users, move the extracted directory to /usr/lib and set the path. At the prompt, switch to the root user and then move the JDK with the command below:
# mv jdk1.7.0_71 /usr/lib/
Now add the following lines to the ~/.bashrc file to set up the path:
export JAVA_HOME=/usr/lib/jdk1.7.0_71
export PATH=$PATH:$JAVA_HOME/bin
Now you can verify the installation by typing "java -version" at the prompt.
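If the installation succeeded, the command prints the JDK version. The exact strings depend on your build, but for this JDK the output should look roughly like:
$ java -version
java version "1.7.0_71"
Java(TM) SE Runtime Environment (build 1.7.0_71-b14)
Java HotSpot(TM) 64-Bit Server VM (build 24.71-b01, mixed mode)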
2) SSH Installation
SSH is used so that the master can interact with the slave machines without being prompted for a password. First of all, create a hadoop user on the master and all slave systems:
# useradd hadoop
# passwd hadoop
To map the nodes, open the hosts file in the /etc/ folder on all the machines and add each node's IP address along with its host name:
# vi /etc/hosts
190.12.1.114 hadoop-master
190.12.1.121 hadoop-slave-one
190.12.1.143 hadoop-slave-two
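As a quick sanity check (not part of the original steps), you can confirm that the mapping works by pinging each node by name:
$ ping -c 1 hadoop-master
$ ping -c 1 hadoop-slave-one
$ ping -c 1 hadoop-slave-two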
Set up an SSH key on every node so that the nodes can communicate among themselves without a password. Assuming the hadoop user created above exists on every machine, the commands are:
# su hadoop
$ ssh-keygen -t rsa
$ ssh-copy-id -i ~/.ssh/id_rsa.pub hadoop@hadoop-master
$ ssh-copy-id -i ~/.ssh/id_rsa.pub hadoop@hadoop-slave-one
$ ssh-copy-id -i ~/.ssh/id_rsa.pub hadoop@hadoop-slave-two
$ chmod 0600 ~/.ssh/authorized_keys
$ exit
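To verify that key-based login works, connect to one of the slaves from the master as the hadoop user; it should log you in without asking for a password:
$ ssh hadoop@hadoop-slave-one
$ exit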
3) Hadoop Installation
Hadoop can be downloaded from http://developer.yahoo.com/hadoop/tutorial/module3.html. Create an installation directory and extract the downloaded archive into it:
$ sudo mkdir /usr/hadoop
$ sudo tar -xvzf hadoop-2.2.0.tar.gz -C /usr/hadoop
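Because the daemons will run as the hadoop user created earlier, that user should own the installation tree. This extra step assumes the /usr/hadoop location used above:
$ sudo chown -R hadoop:hadoop /usr/hadoop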
Open etc/hadoop/hadoop-env.sh and point JAVA_HOME at the JDK installed earlier:
export JAVA_HOME=/usr/lib/jdk1.7.0_71
Next, open etc/hadoop/core-site.xml and add the following properties:
<configuration>
    <property>
        <name>fs.default.name</name>
        <value>hdfs://hadoop-master:9000</value>
    </property>
    <property>
        <name>dfs.permissions</name>
        <value>false</value>
    </property>
</configuration>
Then open etc/hadoop/hdfs-site.xml and edit it as shown below (note the absolute paths):
<configuration>
    <property>
        <name>dfs.data.dir</name>
        <value>/usr/hadoop/dfs/name/data</value>
        <final>true</final>
    </property>
    <property>
        <name>dfs.name.dir</name>
        <value>/usr/hadoop/dfs/name</value>
        <final>true</final>
    </property>
    <property>
        <name>dfs.replication</name>
        <value>1</value>
    </property>
</configuration>
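The directories named in dfs.name.dir and dfs.data.dir should exist and be writable by the hadoop user before the daemons start. Assuming the paths configured above:
$ sudo mkdir -p /usr/hadoop/dfs/name/data
$ sudo chown -R hadoop:hadoop /usr/hadoop/dfs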
Finally, open etc/hadoop/mapred-site.xml and set the JobTracker address:
<configuration>
    <property>
        <name>mapred.job.tracker</name>
        <value>hadoop-master:9001</value>
    </property>
</configuration>
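Hadoop 2.x tarballs usually ship only a template for this file; if etc/hadoop/mapred-site.xml is missing in your distribution, copy the template first:
$ cp etc/hadoop/mapred-site.xml.template etc/hadoop/mapred-site.xml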
Next, add the Hadoop environment variables to ~/.bashrc:
$ cd $HOME
$ vi .bashrc
Append the following lines at the end, then save and exit:
#Hadoop variables
export JAVA_HOME=/usr/lib/jdk1.7.0_71
export HADOOP_INSTALL=/usr/hadoop
export PATH=$PATH:$HADOOP_INSTALL/bin
export PATH=$PATH:$HADOOP_INSTALL/sbin
export HADOOP_MAPRED_HOME=$HADOOP_INSTALL
export HADOOP_COMMON_HOME=$HADOOP_INSTALL
export HADOOP_HDFS_HOME=$HADOOP_INSTALL
export YARN_HOME=$HADOOP_INSTALL
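Reload the file so the variables take effect in the current shell, and confirm that the hadoop binary is on the PATH:
$ source ~/.bashrc
$ hadoop version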
Now copy the configured Hadoop installation from the master to both slaves:
# su hadoop
$ scp -r /usr/hadoop hadoop-slave-one:/usr/
$ scp -r /usr/hadoop hadoop-slave-two:/usr/
On the master node, list the master and slave host names in the masters and slaves files under etc/hadoop:
$ vi etc/hadoop/masters
hadoop-master

$ vi etc/hadoop/slaves
hadoop-slave-one
hadoop-slave-two
After this, format the name node and start all the daemons:
# su hadoop
$ cd /usr/hadoop
$ bin/hadoop namenode -format

$ cd $HADOOP_INSTALL/sbin
$ start-all.sh
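Once the daemons are up, you can check which ones are running with jps on each node. On a Hadoop 2.x master running both HDFS and YARN, the list should look roughly like this (the process IDs will differ):
$ jps
4856 NameNode
5043 SecondaryNameNode
5244 ResourceManager
5519 Jps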