Yarn Tutorial PDF

This document provides instructions for installing YARN (Hadoop 2.x.x) on a system, including modifying configuration files, formatting the namenode, starting HDFS and YARN services, running a MapReduce job, and accessing the HDFS, resource manager, and job history server UIs. Key steps are to download and extract Hadoop, modify configuration files like core-site.xml and hdfs-site.xml, format the namenode, start HDFS and YARN, copy data to HDFS, run a job, and stop the services.


YARN ( Hadoop 2.x.x )

For online Hadoop training, send mail to [email protected]


Agenda
Hadoop 1.0 Vs Hadoop 2.0
Install YARN ( Hadoop 2.x.x ) on your system
Format your Namenode
Start HDFS
Start YARN
Start Job history server
Copy data from local to HDFS
Execute Hadoop job on YARN
HDFS UI
MR UI
Job history server UI
MR1 Vs MR2
Job Tracker Vs Resource Manager/Application Master
Hadoop 1.0 Vs Hadoop 2.0
Download Hadoop from Apache website
Click on stable directory
Click on tar.gz file
Save tar.gz file on your system
Extract the content of tar.gz file
Directory structure of Hadoop

HADOOP_HOME

sbin [ contains scripts to start/stop Hadoop ]

bin [ contains HDFS & Map-Reduce commands ]

etc/hadoop [ contains configuration files of Hadoop ]

lib [ contains libraries/JARs to run Hadoop ]

logs [ contains log files ]

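The layout above can be sanity-checked from the shell. This is a sketch: the mktemp-based mock tree stands in for a real hadoop-2.6.0 extraction so the loop runs anywhere; on a real install, point HADOOP_HOME at your actual extraction directory and skip the mkdirs.

```shell
# Sanity-check the extracted Hadoop layout. The mock tree below stands in
# for a real hadoop-2.6.0 extraction so this sketch is runnable as-is; on
# a real install, set HADOOP_HOME to your extraction directory instead.
HADOOP_HOME=$(mktemp -d)/hadoop-2.6.0
mkdir -p "$HADOOP_HOME/sbin" "$HADOOP_HOME/bin" "$HADOOP_HOME/etc/hadoop" \
         "$HADOOP_HOME/lib" "$HADOOP_HOME/logs"          # mock tree

for d in sbin bin etc/hadoop lib logs; do
  if [ -d "$HADOOP_HOME/$d" ]; then
    echo "ok: $d"
  else
    echo "missing: $d"
  fi
done
```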

Modify etc/hadoop/hadoop-env.sh

export JAVA_HOME=/usr

export HADOOP_HOME=/home/neeraj/Desktop/hadoop-2.6.0

export HADOOP_MAPRED_HOME=/home/neeraj/Desktop/hadoop-2.6.0

export HADOOP_COMMON_HOME=/home/neeraj/Desktop/hadoop-2.6.0

export HADOOP_HDFS_HOME=/home/neeraj/Desktop/hadoop-2.6.0

export HADOOP_YARN_HOME=/home/neeraj/Desktop/hadoop-2.6.0

export HADOOP_CONF_DIR=/home/neeraj/Desktop/hadoop-2.6.0/etc/hadoop
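The same settings can be written against a single base path, which keeps the five HADOOP_*_HOME variables from drifting apart. The install path is the one used throughout these slides; substitute your own.

```shell
# hadoop-env.sh settings factored through one base path. The path below is
# the install location used in these slides; change it to match yours.
export JAVA_HOME=/usr
export HADOOP_HOME=/home/neeraj/Desktop/hadoop-2.6.0
export HADOOP_MAPRED_HOME=$HADOOP_HOME
export HADOOP_COMMON_HOME=$HADOOP_HOME
export HADOOP_HDFS_HOME=$HADOOP_HOME
export HADOOP_YARN_HOME=$HADOOP_HOME
export HADOOP_CONF_DIR=$HADOOP_HOME/etc/hadoop     # note: ends in /etc/hadoop
echo "$HADOOP_CONF_DIR"
```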
Modify etc/hadoop/core-site.xml

<configuration>

<property>
<!-- fs.default.name is deprecated in Hadoop 2.x; fs.defaultFS is the preferred name -->
<name>fs.default.name</name>
<value>hdfs://myubuntu:9000</value>
</property>

<property>
<name>hadoop.tmp.dir</name>
<value>/home/neeraj/Desktop/HDFS2/hdfs_temp</value>
</property>

</configuration>
Modify etc/hadoop/hdfs-site.xml
<configuration>

<property>
<name>dfs.replication</name>
<value>1</value>
</property>

<property>
<!-- dfs.name.dir is deprecated in Hadoop 2.x; dfs.namenode.name.dir is the preferred name -->
<name>dfs.name.dir</name>
<value>/home/neeraj/Desktop/HDFS2/hdfs_metadata</value>
</property>

<property>
<!-- dfs.data.dir is deprecated in Hadoop 2.x; dfs.datanode.data.dir is the preferred name -->
<name>dfs.data.dir</name>
<value>/home/neeraj/Desktop/HDFS2/hdfs_data</value>
</property>

</configuration>
Modify etc/hadoop/mapred-site.xml

<configuration>

<property>
<name>mapreduce.framework.name</name>
<value>yarn</value>
</property>

</configuration>
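Note that the Hadoop 2.x tarball ships only etc/hadoop/mapred-site.xml.template, so the file has to be created before editing. A runnable sketch against a temp directory (on a real install, work in $HADOOP_CONF_DIR instead):

```shell
# The 2.x tarball ships mapred-site.xml.template, not mapred-site.xml;
# copy it before adding the yarn framework property. CONF is a temp dir
# here so the sketch runs anywhere; use $HADOOP_CONF_DIR on a real install.
CONF=$(mktemp -d)
printf '<configuration>\n</configuration>\n' > "$CONF/mapred-site.xml.template"  # stand-in for the shipped template
cp "$CONF/mapred-site.xml.template" "$CONF/mapred-site.xml"
ls "$CONF"
```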
Modify etc/hadoop/yarn-site.xml

<configuration>

<property>
<name>yarn.nodemanager.aux-services</name>
<value>mapreduce_shuffle</value>
</property>

<property>
<name>yarn.nodemanager.aux-services.mapreduce.shuffle.class</name>
<value>org.apache.hadoop.mapred.ShuffleHandler</value>
</property>

</configuration>
Format your Namenode
./hdfs namenode -format
Start HDFS services
./start-dfs.sh
Start YARN services
./start-yarn.sh
Start job history server
./mr-jobhistory-daemon.sh start historyserver
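Taken together, the four start-up slides amount to the sequence below. Paths assume the HADOOP_HOME used earlier; the guard makes the sketch a no-op on a machine without Hadoop, and note that namenode -format is a one-time step that erases existing HDFS metadata.

```shell
# Full start-up sequence, guarded so it is a no-op when Hadoop is absent.
HADOOP_HOME=${HADOOP_HOME:-/home/neeraj/Desktop/hadoop-2.6.0}
if [ -x "$HADOOP_HOME/bin/hdfs" ]; then
  "$HADOOP_HOME/bin/hdfs" namenode -format              # one-time: wipes existing HDFS metadata
  "$HADOOP_HOME/sbin/start-dfs.sh"                      # NameNode, DataNode, SecondaryNameNode
  "$HADOOP_HOME/sbin/start-yarn.sh"                     # ResourceManager, NodeManager
  "$HADOOP_HOME/sbin/mr-jobhistory-daemon.sh" start historyserver
  jps                                                   # should list the daemons above
else
  echo "Hadoop not found at $HADOOP_HOME"
fi
```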
Copy data from local file system to HDFS

./hdfs dfs -copyFromLocal <local path> <HDFS path>
Run Map-Reduce job on Hadoop cluster
./hadoop jar <jar path> <job class name> <input path> <non-existing output path>
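As a concrete instance of the two commands above, the examples jar bundled with the 2.6.0 tarball includes a wordcount driver. input.txt and /wc-out are placeholder names (the output path must not already exist in HDFS), and the guard keeps the sketch a no-op without a running cluster.

```shell
# Concrete run of the copy + job commands using the bundled wordcount
# example. input.txt and /wc-out are placeholders; /wc-out must not
# already exist in HDFS. Guarded so the sketch is a no-op without Hadoop.
HADOOP_HOME=${HADOOP_HOME:-/home/neeraj/Desktop/hadoop-2.6.0}
JAR="$HADOOP_HOME/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.6.0.jar"
if [ -x "$HADOOP_HOME/bin/hdfs" ]; then
  "$HADOOP_HOME/bin/hdfs" dfs -copyFromLocal input.txt /input.txt
  "$HADOOP_HOME/bin/hadoop" jar "$JAR" wordcount /input.txt /wc-out
else
  echo "Hadoop not found at $HADOOP_HOME"
fi
```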
Browse HDFS data on browser
http://localhost:50070
Browse applications on browser
http://localhost:8088
Browse job history on browser
http://localhost:19888
Stop job history server
./mr-jobhistory-daemon.sh stop historyserver
Stop YARN services
./stop-yarn.sh
Stop HDFS services
./stop-dfs.sh
…Thanks…

For online Hadoop training, send mail to [email protected]
