Oozie Commands
==============

Prerequisites:
==============
sudo -u hdfs hdfs dfs -mkdir /user/hdpuser
sudo -u hdfs hdfs dfs -mkdir /user/hdpuser/oozie
sudo -u hdfs hdfs dfs -chown -R hdpuser:hdfs /user/hdpuser
sudo -u hdfs hdfs dfs -chmod -R 770 /apps

sudo -u hdfs hdfs dfs -mkdir /project1
sudo -u hdfs hdfs dfs -mkdir /project1/oozie
sudo -u hdfs hdfs dfs -mkdir /project1/data
sudo -u hdfs hdfs dfs -chmod -R 770 /project1
sudo -u hdfs hdfs dfs -chown -R hdpuser:hdfs /project1

If you enable the NFS Gateway, use the commands below; otherwise skip this section:
=============================================================================
sudo mkdir /hdfs
sudo umount -l /hdfs
sudo mount -t nfs -o vers=3,proto=tcp,nolock,sync,rsize=1048576,wsize=1048576,noatime localhost:/ /hdfs

sudo mkdir /hdfs/user/hdpuser
sudo mkdir /hdfs/user/hdpuser/oozie
sudo chown -R hdpuser:hdfs /hdfs/user/hdpuser
sudo chmod -R 777 /hdfs/apps

sudo mkdir /hdfs/project1
sudo mkdir /hdfs/project1/oozie
sudo mkdir /hdfs/project1/data
sudo chmod -R 770 /hdfs/project1
sudo chown -R hdpuser:hdfs /hdfs/project1

In Oozie Node:
==============
hdfs dfs -ls /user/oozie/share/lib/
sudo -u oozie hdfs dfs -put /usr/share/java/mysql-connector-java.jar /user/oozie/share/lib/lib_20181226065713/sqoop/
sudo -u oozie oozie admin -oozie http://hn3.hadoop.com:11000/oozie/ -sharelibupdate
export OOZIE_URL="http://hn3.hadoop.com:11000/oozie/"
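The oozie CLI reads the OOZIE_URL environment variable, so once it is exported the -oozie flag can be dropped from subsequent commands. A quick sanity check of the variable:

```shell
# With OOZIE_URL exported, the oozie CLI targets this server by default
# and the -oozie flag becomes optional on later commands.
export OOZIE_URL="http://hn3.hadoop.com:11000/oozie/"
echo "oozie CLI will target: ${OOZIE_URL}"
```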

Submit wf_sales_data Workflow:
==============================
sudo chown -R hdpuser:hdpadmin /home/hdpuser/wf_sales_data
hdfs dfs -rm -r -skipTrash /project1/oozie/wf_sales_data
hdfs dfs -put /home/hdpuser/wf_sales_data /project1/oozie/
cd /home/hdpuser/wf_sales_data
oozie job -config job.properties -run
oozie job -info <job-id>

hdfs dfs -ls /project1/data/sales_data_dump
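The `-config job.properties` step above assumes a properties file alongside the workflow. A hypothetical minimal job.properties for wf_sales_data — the NameNode/ResourceManager hosts and ports below are assumptions, not values from this cluster:

```shell
# Sketch of a minimal job.properties; hn1/hn2 host names and ports are
# illustrative assumptions -- substitute your cluster's values.
cat > job.properties <<'EOF'
nameNode=hdfs://hn1.hadoop.com:8020
jobTracker=hn2.hadoop.com:8050
queueName=default
oozie.use.system.libpath=true
oozie.wf.application.path=${nameNode}/project1/oozie/wf_sales_data
EOF
```

`oozie.wf.application.path` must point at the HDFS directory the workflow was uploaded to (here, /project1/oozie/wf_sales_data).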

Submit coord_sqoop_sales_data Coordinator Without Dependencies:
===============================================================

sudo chown -R hdpuser:hdpadmin /home/hdpuser/coord_sqoop_sales_data
hdfs dfs -rm -r -skipTrash /project1/oozie/coord_sqoop_sales_data
hdfs dfs -put /home/hdpuser/coord_sqoop_sales_data /project1/oozie/
cd /home/hdpuser/coord_sqoop_sales_data
oozie job -config job.coordinator.properties -run
oozie job -info <job-id>

hdfs dfs -ls /project1/data/sales_data_raw/
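Coordinator submissions use a job.coordinator.properties file instead. A hypothetical sketch for coord_sqoop_sales_data — hosts, ports, and the start/end window are assumptions for illustration:

```shell
# Sketch of a coordinator properties file; host names, ports, and the
# start/end window are illustrative assumptions.
cat > job.coordinator.properties <<'EOF'
nameNode=hdfs://hn1.hadoop.com:8020
jobTracker=hn2.hadoop.com:8050
queueName=default
oozie.use.system.libpath=true
oozie.coord.application.path=${nameNode}/project1/oozie/coord_sqoop_sales_data
start=2019-01-01T00:00Z
end=2019-12-31T00:00Z
EOF
```

Note the key is `oozie.coord.application.path` (not `oozie.wf.application.path`) when submitting a coordinator.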

Submit coord_pig_sales_data Coordinator With Dependencies:
==========================================================

sudo chown -R hdpuser:hdpadmin /home/hdpuser/coord_pig_sales_data
hdfs dfs -rm -r -skipTrash /project1/oozie/coord_pig_sales_data
hdfs dfs -put /home/hdpuser/coord_pig_sales_data /project1/oozie/
cd /home/hdpuser/coord_pig_sales_data
oozie job -config job.coordinator.properties -run
oozie job -info <job-id>

hdfs dfs -ls /project1/data/sales_data_pig_output/

Submit coord_hive_sales_data Coordinator With Dependencies:
===========================================================
sudo chown -R hdpuser:hdpadmin /home/hdpuser/coord_hive_sales_data
hdfs dfs -rm -r -skipTrash /project1/oozie/coord_hive_sales_data
hdfs dfs -put /home/hdpuser/coord_hive_sales_data /project1/oozie/
cd /home/hdpuser/coord_hive_sales_data
oozie job -config job.coordinator.properties -run
oozie job -info <job-id>

Submit Sample partitionload Coordinator With Dependencies:
==========================================================

hadoop fs -rm -r /user/cloudera/oozie/partitionload
hadoop fs -put /opt/examples/oozie/partitionload /user/cloudera/oozie/
cd /opt/examples/oozie/partitionload
oozie job -oozie http://quickstart.cloudera:11000/oozie/ -config job.coordinator.properties -run

Submit Sample TDUMP Workflow:
=============================

hadoop fs -rm -r /user/cloudera/oozie/wfdump
hadoop fs -put /opt/examples/oozie/wfdump /user/cloudera/oozie/
cd /opt/examples/oozie/wfdump
oozie job -oozie http://quickstart.cloudera:11000/oozie/ -config job.coordinator.properties -run

Check status of Workflow / Coordinator:
=======================================

oozie job -oozie http://quickstart.cloudera:11000/oozie/ -info <job-id>

Kill Workflow / Coordinator:
============================

oozie job -oozie http://quickstart.cloudera:11000/oozie/ -kill 0000044-160324191529712-oozie-oozi-C

oozie job -oozie http://quickstart.cloudera:11000/oozie/ -change 0000000-160324191529712-oozie-oozi-C -value concurrency=5

oozie job -oozie http://quickstart.cloudera:11000/oozie/ -suspend 0000000-160324191529712-oozie-oozi-C

oozie job -oozie http://quickstart.cloudera:11000/oozie/ -suspend 0000029-160324191529712-oozie-oozi-C

for d in * ; do nohup sudo gzip "$d" & done

du -sch /data/mysqlbakups/
df -h | grep /data

Timeout: A coordinator job can specify the timeout for its coordinator actions, that is, how long a coordinator action may stay in WAITING or READY status before the coordinator engine gives up on its execution.
Concurrency: A coordinator job can specify the concurrency for its coordinator actions, that is, how many coordinator actions are allowed to run concurrently (RUNNING status) before the coordinator engine starts throttling them.
Execution strategy: A coordinator job can specify the execution strategy for its coordinator actions when there is a backlog of coordinator actions in the coordinator engine. The available strategies are 'oldest first', 'newest first', and 'last one only'. A backlog normally results from delayed input data, concurrency control, or manual re-runs of coordinator jobs.
Throttle: A coordinator job can specify the materialization (creation) throttle value for its coordinator actions, that is, the maximum number of coordinator actions allowed to be in WAITING status concurrently.
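These four knobs correspond to the <controls> element of a coordinator.xml. A hedged sketch, written here as a shell heredoc so it can be dropped into a coordinator definition — the numeric values are illustrative assumptions, not recommendations:

```shell
# Sketch of the <controls> block of a coordinator.xml; values are
# illustrative assumptions. Timeout is in minutes. Execution strategies:
# FIFO = oldest first (default), LIFO = newest first, LAST_ONLY = last one only.
cat > controls-snippet.xml <<'EOF'
<controls>
  <timeout>30</timeout>
  <concurrency>2</concurrency>
  <execution>FIFO</execution>
  <throttle>5</throttle>
</controls>
EOF
```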
