Oozie Basic Exercise

The document provides steps to run a MapReduce wordcount example using Oozie workflow on a Hadoop cluster. It involves creating directories on HDFS, uploading the workflow file, example jar, and sample data. Then configuring the workflow.xml file and running the Oozie job to trigger the MapReduce job which counts word occurrences in the sample text data. Finally, checking the job status and output results.


1. Create a working folder in /home/hadoop

mkdir /home/hadoop/basic-oozie-exercise

2. Extract the MapReduce examples jar, located under /opt/hadoop/share/hadoop/map-reduce....

jar -xvf <filename.jar>

3. Find the files associated with the wordcount sample
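One way to do this (an illustrative sketch; the `#` in the jar name stands in for the version number, as in the later steps) is to list the jar's contents without extracting it:

```shell
# List the jar's contents and filter for the WordCount classes; the mapper,
# reducer, and driver live in the org.apache.hadoop.examples package.
jar -tf hadoop-mapreduce-examples#.jar | grep WordCount
```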

4. Create a job.properties file

nameNode=hdfs://master:9000
jobTracker=master:8050
queueName=default
examplesRoot=examplesoozie#
oozie.wf.application.path=${nameNode}/user/${user.name}/${examplesRoot}/map-reduce
outputDir=map-reduce#
oozie.libpath=${nameNode}/${user.name}/share/lib
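As a sanity check, the placeholders in job.properties can be resolved by hand. This small shell sketch (the user name `hadoop` is an assumption taken from the HDFS paths used later) prints the application path Oozie will look up:

```shell
# Resolve the job.properties placeholders manually to preview the
# final workflow application path (user.name assumed to be "hadoop").
nameNode=hdfs://master:9000
user_name=hadoop
examplesRoot='examplesoozie#'
appPath="${nameNode}/user/${user_name}/${examplesRoot}/map-reduce"
echo "$appPath"    # → hdfs://master:9000/user/hadoop/examplesoozie#/map-reduce
```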

5. Create a workflow.xml

<workflow-app xmlns="uri:oozie:workflow:0.1" name="map-reduce-wf">
    <start to="mr-node"/>
    <action name="mr-node">
        <map-reduce>
            <job-tracker>${jobTracker}</job-tracker>
            <name-node>${nameNode}</name-node>
            <prepare>
                <delete path="${nameNode}/user/${wf:user()}/${examplesRoot}/output-data/${outputDir}"/>
            </prepare>
            <configuration>
                <property>
                    <name>mapred.mapper.new-api</name>
                    <value>true</value>
                </property>
                <property>
                    <name>mapred.reducer.new-api</name>
                    <value>true</value>
                </property>
                <property>
                    <name>mapred.job.queue.name</name>
                    <value>${queueName}</value>
                </property>
                <property>
                    <name>mapreduce.map.class</name>
                    <value>org.apache.hadoop.examples.WordCount$TokenizerMapper</value>
                </property>
                <property>
                    <name>mapreduce.reduce.class</name>
                    <value>org.apache.hadoop.examples.WordCount$IntSumReducer</value>
                </property>
                <property>
                    <name>mapreduce.combine.class</name>
                    <value>org.apache.hadoop.examples.WordCount$IntSumReducer</value>
                </property>
                <property>
                    <name>mapred.output.key.class</name>
                    <value>org.apache.hadoop.io.Text</value>
                </property>
                <property>
                    <name>mapred.output.value.class</name>
                    <value>org.apache.hadoop.io.IntWritable</value>
                </property>
                <property>
                    <name>mapred.input.dir</name>
                    <value>/user/${wf:user()}/${examplesRoot}/input-data/text</value>
                </property>
                <property>
                    <name>mapred.output.dir</name>
                    <value>/user/${wf:user()}/${examplesRoot}/output-data/${outputDir}</value>
                </property>
            </configuration>
        </map-reduce>
        <ok to="end"/>
        <error to="fail"/>
    </action>
    <kill name="fail">
        <message>Map/Reduce failed, error message[${wf:errorMessage(wf:lastErrorNode())}]</message>
    </kill>
    <end name="end"/>
</workflow-app>
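Before uploading, the file can be sanity-checked against the workflow schema with the Oozie CLI (assuming the `oozie` client is on the PATH):

```shell
oozie validate workflow.xml
```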

6. Create a directory on HDFS

hdfs dfs -mkdir -p /user/hadoop/examplesoozie#/map-reduce
hdfs dfs -copyFromLocal workflow.xml /user/hadoop/examplesoozie#/map-reduce/workflow.xml

7. Create a folder named lib in which the required library / jar files are kept.

hdfs dfs -mkdir -p /user/hadoop/examplesoozie#/map-reduce/lib

8. Copy the Hadoop MapReduce examples jar under this directory.

hdfs dfs -copyFromLocal hadoop-mapreduce-examples#.jar /user/hadoop/examplesoozie#/map-reduce/lib/hadoop-mapreduce-examples#.jar
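At this point the application layout on HDFS can be verified (a suggested check, not part of the original steps): workflow.xml should sit at the top of the application directory and the examples jar under lib/.

```shell
# Recursively list the application directory to confirm the layout
# before submitting the job.
hdfs dfs -ls -R '/user/hadoop/examplesoozie#/map-reduce'
```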

Checklist before running:

a. hdfs folder created for the program
b. lib folder in the program folder containing the jar
c. paths adjusted correctly in workflow.xml
d. workflow file uploaded
e. data folder created on HDFS
f. data file uploaded to HDFS
g. OOZIE_HOME and PATH exported in ~/.profile, then source the .profile:

. ~/.profile
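For item g, the ~/.profile additions might look like this (the install prefix /opt/oozie-4.2.0 is taken from the sharelib path used below; adjust if your Oozie lives elsewhere):

```shell
# Append to ~/.profile: make the Oozie client available on the PATH.
export OOZIE_HOME=/opt/oozie-4.2.0
export PATH=$PATH:$OOZIE_HOME/bin
```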

Extract the Oozie sharelib under /user/hadoop:

sudo mkdir -p /user/hadoop
cd /user/hadoop
sudo tar -xvzf /opt/oozie-4.2.0/oozie-sharelib-4.2.0.tar.gz
sudo chown -R hadoop:hadoop /user/hadoop

9. Run the Hadoop MapReduce program for WordCount

oozie job -oozie http://localhost:11000/oozie -config job.properties -run
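On submission Oozie prints the new job id on a line of the form `job: <id>`. A small sketch of capturing that id for the next step (the sample id below is made up for illustration):

```shell
# Strip the "job: " prefix from Oozie's submit output to get the job id.
submit_output="job: 0000001-200101000000000-oozie-hado-W"
jobid=${submit_output#job: }
echo "$jobid"    # → 0000001-200101000000000-oozie-hado-W
```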

10. View the status of the job

oozie job -oozie http://localhost:11000/oozie -info <jobid>

11. Review the output in the directory as specified by workflow.xml

hdfs dfs -cat /user/hadoop/examplesoozie#/output-data/map-reduce/part-r-00000

If the ResourceManager scheduler port (8030) or the JobHistory server port (10020) is only reachable on the master node, forward them over SSH:

ssh -L 0.0.0.0:8030:master:8030 master
ssh -L 0.0.0.0:10020:master:10020 master
