Oozie: Orchestrating Your Hadoop Jobs
Workflow diagram: pig (Join and filter results), sqoop (Extract movie titles)
<?xml version="1.0" encoding="UTF-8"?>
<workflow-app xmlns="uri:oozie:workflow:0.2" name="top-movies">
<start to="fork-node"/>
<action name="sqoop-node">
<sqoop xmlns="uri:oozie:sqoop-action:0.2">
</sqoop>
<ok to="joining"/>
<error to="fail"/>
</action>
<action name="hive-node">
<hive xmlns="uri:oozie:hive-action:0.2">
</hive>
<ok to="end"/>
<error to="fail"/>
</action>
<kill name="fail">
<message>Sqoop failed, error message[${wf:errorMessage(wf:lastErrorNode())}]</message>
</kill>
<end name="end"/>
</workflow-app>
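The workflow above transitions to `fork-node` and `joining`, but the excerpt does not define either node. As a minimal sketch of what such control nodes might look like inside `<workflow-app>`, assuming a second parallel branch named `pig-node` and a join that hands off to `hive-node` (both are assumptions, not taken from the slides):

```xml
<!-- Fragment only: these nodes would sit inside <workflow-app>, next to the actions. -->

<!-- A fork starts every listed path in parallel. -->
<fork name="fork-node">
    <path start="sqoop-node"/>
    <!-- "pig-node" is a hypothetical second branch, not defined in the excerpt -->
    <path start="pig-node"/>
</fork>

<!-- A join waits until every forked path has transitioned to it,
     then continues to its "to" target. -->
<!-- Continuing to "hive-node" is an assumption about the intended flow -->
<join name="joining" to="hive-node"/>
```

With nodes like these in place, each forked action's `<ok to="joining"/>` transition has a target, and the workflow continues only after all branches have finished.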
Steps to set up a workflow in Oozie
nameNode=hdfs://sandbox.hortonworks.com:8020
jobTracker=http://sandbox.hortonworks.com:8050
queueName=default
oozie.use.system.libpath=true
oozie.wf.application.path=${nameNode}/user/maria_dev
Running a workflow with Oozie
<coordinator-app xmlns="uri:oozie:coordinator:0.2" name="sample coordinator" frequency="5 * * * *" start="2016-01-18T01:00Z" end="2025-12-31T00:00Z" timezone="America/Los_Angeles">
<controls>
<timeout>1</timeout>
<concurrency>1</concurrency>
<execution>FIFO</execution>
<throttle>1</throttle>
</controls>
<action>
<workflow>
<app-path>pathof_workflow_xml/workflow.xml</app-path>
</workflow>
</action>
</coordinator-app>
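A coordinator can also pass properties down to the workflow it launches, which is one way to supply values such as `nameNode` and `queueName` that would otherwise come from job.properties. The fragment below is a hedged sketch of the `<action>` element only; the application path under `/user/maria_dev` is an assumption for illustration.

```xml
<!-- Fragment only: this would replace the <action> element of a coordinator-app. -->
<action>
    <workflow>
        <!-- Hypothetical HDFS directory that holds workflow.xml -->
        <app-path>${nameNode}/user/maria_dev</app-path>
        <!-- Properties listed here are handed to the workflow job -->
        <configuration>
            <property>
                <name>nameNode</name>
                <value>${nameNode}</value>
            </property>
            <property>
                <name>queueName</name>
                <value>${queueName}</value>
            </property>
        </configuration>
    </workflow>
</action>
```

The `${nameNode}` and `${queueName}` references are resolved from the properties supplied when the coordinator job is submitted.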
Oozie bundles
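A bundle groups one or more coordinators so that they can be started, suspended, and killed together. As a minimal sketch, assuming a single coordinator whose definition lives at a hypothetical HDFS path (the bundle name, coordinator name, and path below are all illustrative):

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Sketch only: names and paths are assumptions, not taken from the slides. -->
<bundle-app xmlns="uri:oozie:bundle:0.2" name="sample-bundle">
    <!-- Each <coordinator> entry points at one coordinator application -->
    <coordinator name="sample-coordinator">
        <!-- Hypothetical location of the coordinator definition in HDFS -->
        <app-path>${nameNode}/user/maria_dev/coordinator.xml</app-path>
    </coordinator>
</bundle-app>
```

When submitting a bundle, the properties file would typically set `oozie.bundle.application.path` to the bundle's HDFS location, in the same way `oozie.wf.application.path` is used for a workflow.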