Ex 3

1. Stop all Hadoop 1.x processes

Run from the master:

$ stop-all.sh
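Before moving on, it is worth confirming the old daemons are really gone; jps on each node should report nothing but the Jps process itself:

$ jps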

2. Remove and repoint the symlink (as the ubuntu user)


sudo rm /opt/hadoop
sudo ln -s /usr/local/hadoop-2.6.0 /opt/hadoop
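A quick sanity check that the link now points at the new release (assuming the 2.6.0 tarball was unpacked under /usr/local as above):

$ ls -l /opt/hadoop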

3. Edit .profile (check that it contains the following entries)


cat /home/hadoop/.profile

export JAVA_HOME=/usr/lib/jvm/java-7-openjdk-amd64
export HADOOP_HOME=/opt/hadoop
export HADOOP_MAPRED_HOME=$HADOOP_HOME
export HADOOP_COMMON_HOME=$HADOOP_HOME
export HADOOP_HDFS_HOME=$HADOOP_HOME
export YARN_HOME=$HADOOP_HOME
export HADOOP_CONF_DIR=$HADOOP_HOME/etc/hadoop
export YARN_CONF_DIR=$HADOOP_HOME/etc/hadoop
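These entries only take effect in a new login shell; to apply them to the current session and sanity-check the result:

$ source /home/hadoop/.profile
$ echo $HADOOP_HOME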

4. Create a temp folder in HADOOP_HOME


$ mkdir -p $HADOOP_HOME/tmp
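If the daemons run as the hadoop user (the .profile location above suggests so), that user needs write access to the new directory; one way to grant it, assuming a hadoop user and group exist:

$ sudo chown -R hadoop:hadoop $HADOOP_HOME/tmp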

5. Make the following changes on all machines (a way to push the edited files out to the slaves is sketched after the yarn-site.xml listing):

$HADOOP_CONF_DIR/core-site.xml :
<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
  <!-- fs.default.name is deprecated in Hadoop 2; fs.defaultFS is the
       current name, but the old key still works -->
  <property>
    <name>fs.default.name</name>
    <value>hdfs://master:9000</value>
  </property>
  <!-- the directory created in step 4 (/opt/hadoop is the 2.6.0 symlink) -->
  <property>
    <name>hadoop.tmp.dir</name>
    <value>/opt/hadoop/tmp</value>
  </property>
</configuration>

$HADOOP_CONF_DIR/hdfs-site.xml :

<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
  <!-- each block is stored on two datanodes -->
  <property>
    <name>dfs.replication</name>
    <value>2</value>
  </property>
  <!-- disables HDFS permission checks; fine for a lab, not for production -->
  <property>
    <name>dfs.permissions</name>
    <value>false</value>
  </property>
  <!-- 4 MB blocks (the 2.x default is 128 MB) -->
  <property>
    <name>dfs.block.size</name>
    <value>4194304</value>
  </property>
</configuration>

$HADOOP_CONF_DIR/mapred-site.xml :

<?xml version="1.0"?>
<configuration>
  <!-- run MapReduce jobs on YARN instead of the 1.x JobTracker -->
  <property>
    <name>mapreduce.framework.name</name>
    <value>yarn</value>
  </property>
</configuration>

$HADOOP_CONF_DIR/yarn-site.xml :

<?xml version="1.0"?>
<configuration>
  <!-- the auxiliary shuffle service MapReduce needs on every NodeManager -->
  <property>
    <name>yarn.nodemanager.aux-services</name>
    <value>mapreduce_shuffle</value>
  </property>
  <!-- the class key must match the service name above (mapreduce_shuffle);
       the older mapreduce.shuffle spelling is ignored in 2.x -->
  <property>
    <name>yarn.nodemanager.aux-services.mapreduce_shuffle.class</name>
    <value>org.apache.hadoop.mapred.ShuffleHandler</value>
  </property>
  <property>
    <name>yarn.resourcemanager.resource-tracker.address</name>
    <value>master:8025</value>
  </property>
  <property>
    <name>yarn.resourcemanager.scheduler.address</name>
    <value>master:8030</value>
  </property>
  <!-- note: 8040 is also the NodeManager's default localizer port; if a
       NodeManager runs on master, the stock default 8032 avoids a clash -->
  <property>
    <name>yarn.resourcemanager.address</name>
    <value>master:8040</value>
  </property>
  <property>
    <name>yarn.resourcemanager.hostname</name>
    <value>master</value>
  </property>
</configuration>
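Rather than editing every machine by hand, the four files can be pushed out from master; a minimal sketch, assuming passwordless SSH and identical paths on the slaves:

$ for h in slave1 slave2 slave3; do
>   scp $HADOOP_CONF_DIR/{core-site,hdfs-site,mapred-site,yarn-site}.xml $h:$HADOOP_CONF_DIR/
> done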

6. Add slaves
Add the slave entries in $HADOOP_CONF_DIR/slaves on all machines:
slave1
slave2
slave3
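A quick check that master can reach every listed slave (passwordless SSH is assumed throughout this setup):

$ for h in $(cat $HADOOP_CONF_DIR/slaves); do ssh $h hostname; done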

7. Format the namenode on master (hdfs replaces the deprecated 1.x hadoop namenode form)

$ bin/hdfs namenode -format

8. Start Hadoop daemons on master

$ sbin/hadoop-daemon.sh start namenode
$ sbin/hadoop-daemons.sh start datanode
$ sbin/yarn-daemon.sh start resourcemanager
$ sbin/yarn-daemons.sh start nodemanager
$ sbin/mr-jobhistory-daemon.sh start historyserver

The plural *-daemons.sh scripts start the daemon over SSH on every host in the slaves file. If that was skipped, start the daemons on each slave individually (note the singular script names):

$ sbin/yarn-daemon.sh start nodemanager
$ sbin/hadoop-daemon.sh start datanode
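Equivalently, the bundled wrapper scripts bring up HDFS and YARN across the whole cluster in two commands (the history server still needs its own start line, and start-dfs.sh will also launch a SecondaryNameNode, which the per-daemon commands above skip):

$ sbin/start-dfs.sh
$ sbin/start-yarn.sh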
9. Check the jps output on master and the slaves.
For master:
$ jps
6539 ResourceManager
6451 DataNode
8701 Jps
6895 JobHistoryServer
6234 NameNode
6765 NodeManager

For slaves:
$ jps
8014 NodeManager
7858 DataNode
9868 Jps

10. Create a sample file to test (finish the cat input with Ctrl-D)

$ mkdir input
$ cat > input/file
This is one line
This is another one

11. Add this directory to HDFS (hdfs dfs replaces the deprecated hadoop dfs form):

$ bin/hdfs dfs -copyFromLocal input /input
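To confirm the upload landed:

$ bin/hdfs dfs -ls /input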

12. Run the sample wordcount program

$ bin/hadoop jar share/hadoop/mapreduce/hadoop-mapreduce-examples-2.*.jar wordcount /input /output

13. Verify the output

$ bin/hdfs dfs -cat /output/*
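For the two-line sample file from step 10, the counts should come out as below ("This" sorts first because Text keys compare as raw bytes, so uppercase precedes lowercase):

This	2
another	1
is	2
line	1
one	2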

14. Check the URLs for cluster health, HDFS, and job history
1. http://master:50070/dfshealth.html (the 1.x dfshealth.jsp page was replaced in 2.x)
2. http://master:8088/cluster
3. http://master:19888/jobhistory (Job History Server)

15. Verify that the output directory was generated
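Listing the directory is enough for this: a successful job leaves an empty _SUCCESS marker next to the part file:

$ bin/hdfs dfs -ls /output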

16. Pull down the output to the local machine

$ bin/hdfs dfs -copyToLocal /output/part-r-00000 .
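The local copy can then be compared against the step 13 output with ordinary tools:

$ cat part-r-00000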

17. Stop the Hadoop daemons


$ sbin/mr-jobhistory-daemon.sh stop historyserver
$ sbin/yarn-daemons.sh stop nodemanager
$ sbin/yarn-daemon.sh stop resourcemanager
$ sbin/hadoop-daemons.sh stop datanode
$ sbin/hadoop-daemon.sh stop namenode
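As with startup, the wrapper scripts do the same in fewer commands:

$ sbin/mr-jobhistory-daemon.sh stop historyserver
$ sbin/stop-yarn.sh
$ sbin/stop-dfs.sh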
