Oozie Basic Exercise

The document provides steps to run a MapReduce wordcount example using Oozie workflow on a Hadoop cluster. It involves creating directories on HDFS, uploading the workflow file, example jar, and sample data. Then configuring the workflow.xml file and running the Oozie job to trigger the MapReduce job which counts word occurrences in the sample text data. Finally, checking the job status and output results.


1. Create a working folder in /home/hadoop

mkdir /home/hadoop/basic-oozie-exercise

2. Extract the MapReduce examples jar, located under /opt/hadoop/share/hadoop/map-reduce....

jar -xvf <filename.jar>

3. Find the files associated with the wordcount sample
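One way to do this (an illustrative sketch; the `#` in the jar name stands in for the version number, as in the later steps) is to list the jar's contents without extracting it:

```shell
# List the jar's contents and filter for the WordCount classes; the mapper,
# reducer, and driver live in the org.apache.hadoop.examples package.
jar -tf hadoop-mapreduce-examples#.jar | grep WordCount
```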

4. Create a job.properties file

nameNode=hdfs://master:9000
jobTracker=master:8050
queueName=default
examplesRoot=examplesoozie#
oozie.wf.application.path=${nameNode}/user/${user.name}/${examplesRoot}/map-reduce
outputDir=map-reduce#
oozie.libpath=${nameNode}/${user.name}/share/lib
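As a sanity check, the placeholders in job.properties can be resolved by hand. This small shell sketch (the user name `hadoop` is an assumption taken from the HDFS paths used later) prints the application path Oozie will look up:

```shell
# Resolve the job.properties placeholders manually to preview the
# final workflow application path (user.name assumed to be "hadoop").
nameNode=hdfs://master:9000
user_name=hadoop
examplesRoot='examplesoozie#'
appPath="${nameNode}/user/${user_name}/${examplesRoot}/map-reduce"
echo "$appPath"    # → hdfs://master:9000/user/hadoop/examplesoozie#/map-reduce
```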

5. Create a workflow.xml

<workflow-app xmlns="uri:oozie:workflow:0.1" name="map-reduce-wf">
    <start to="mr-node"/>
    <action name="mr-node">
        <map-reduce>
            <job-tracker>${jobTracker}</job-tracker>
            <name-node>${nameNode}</name-node>
            <prepare>
                <delete path="${nameNode}/user/${wf:user()}/${examplesRoot}/output-data/${outputDir}"/>
            </prepare>
            <configuration>
                <property>
                    <name>mapred.mapper.new-api</name>
                    <value>true</value>
                </property>
                <property>
                    <name>mapred.reducer.new-api</name>
                    <value>true</value>
                </property>
                <property>
                    <name>mapred.job.queue.name</name>
                    <value>${queueName}</value>
                </property>
                <property>
                    <name>mapreduce.map.class</name>
                    <value>org.apache.hadoop.examples.WordCount$TokenizerMapper</value>
                </property>
                <property>
                    <name>mapreduce.reduce.class</name>
                    <value>org.apache.hadoop.examples.WordCount$IntSumReducer</value>
                </property>
                <property>
                    <name>mapreduce.combine.class</name>
                    <value>org.apache.hadoop.examples.WordCount$IntSumReducer</value>
                </property>
                <property>
                    <name>mapred.output.key.class</name>
                    <value>org.apache.hadoop.io.Text</value>
                </property>
                <property>
                    <name>mapred.output.value.class</name>
                    <value>org.apache.hadoop.io.IntWritable</value>
                </property>
                <property>
                    <name>mapred.input.dir</name>
                    <value>/user/${wf:user()}/${examplesRoot}/input-data/text</value>
                </property>
                <property>
                    <name>mapred.output.dir</name>
                    <value>/user/${wf:user()}/${examplesRoot}/output-data/${outputDir}</value>
                </property>
            </configuration>
        </map-reduce>
        <ok to="end"/>
        <error to="fail"/>
    </action>
    <kill name="fail">
        <message>Map/Reduce failed, error message[${wf:errorMessage(wf:lastErrorNode())}]</message>
    </kill>
    <end name="end"/>
</workflow-app>
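Before uploading, the file can be sanity-checked against the workflow schema with the Oozie CLI (assuming the `oozie` client is on the PATH):

```shell
oozie validate workflow.xml
```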

6. Create a directory on HDFS

hdfs dfs -mkdir -p /user/hadoop/examplesoozie#/map-reduce
hdfs dfs -copyFromLocal workflow.xml /user/hadoop/examplesoozie#/map-reduce/workflow.xml

7. Create a folder named lib in which the required library / jar files are kept.

hdfs dfs -mkdir -p /user/hadoop/examplesoozie#/map-reduce/lib

8. Copy the Hadoop MapReduce examples jar under this directory.

hdfs dfs -copyFromLocal hadoop-mapreduce-examples#.jar /user/hadoop/examplesoozie#/map-reduce/lib/hadoop-mapreduce-examples#.jar
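At this point the application layout on HDFS can be verified (a suggested check, not part of the original steps): workflow.xml should sit at the top of the application directory and the examples jar under lib/.

```shell
# Recursively list the application directory to confirm the layout
# before submitting the job.
hdfs dfs -ls -R '/user/hadoop/examplesoozie#/map-reduce'
```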

Checklist before running:

a. hdfs folder created for the program
b. lib folder in the program folder containing the jar
c. paths adjusted correctly in workflow.xml
d. workflow file uploaded
e. data folder created on HDFS
f. data file uploaded to HDFS
g. OOZIE_HOME and PATH exported in ~/.profile, then source the .profile:

. ~/.profile
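For item g, the ~/.profile additions might look like this (the install prefix /opt/oozie-4.2.0 is taken from the sharelib path used below; adjust if your Oozie lives elsewhere):

```shell
# Append to ~/.profile: make the Oozie client available on the PATH.
export OOZIE_HOME=/opt/oozie-4.2.0
export PATH=$PATH:$OOZIE_HOME/bin
```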

Extract the Oozie sharelib under /user/hadoop:

sudo mkdir -p /user/hadoop
cd /user/hadoop
sudo tar -xvzf /opt/oozie-4.2.0/oozie-sharelib-4.2.0.tar.gz
sudo chown -R hadoop:hadoop /user/hadoop

9. Run the Hadoop MapReduce program for WordCount

oozie job -oozie http://localhost:11000/oozie -config job.properties -run
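On submission Oozie prints the new job id on a line of the form `job: <id>`. A small sketch of capturing that id for the next step (the sample id below is made up for illustration):

```shell
# Strip the "job: " prefix from Oozie's submit output to get the job id.
submit_output="job: 0000001-200101000000000-oozie-hado-W"
jobid=${submit_output#job: }
echo "$jobid"    # → 0000001-200101000000000-oozie-hado-W
```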

10. View the status of the job

oozie job -oozie http://localhost:11000/oozie -info <jobid>

11. Review the output in the directory as specified by workflow.xml

hdfs dfs -cat /user/hadoop/examplesoozie#/output-data/map-reduce/part-r-00000

If the ResourceManager scheduler port (8030) or the JobHistory server port (10020) is only reachable on the master node, forward them over SSH:

ssh -L 0.0.0.0:8030:master:8030 master
ssh -L 0.0.0.0:10020:master:10020 master
