Is something like this possible in Oozie?
hive -f hiveScript.hql > output.txt
I have the following Oozie Hive action for the above command:
<!-- enclosing action element; the name "hive-node" is a placeholder -->
<action name="hive-node">
    <hive xmlns="uri:oozie:hive-action:0.1">
        <job-tracker>${jobTracker}</job-tracker>
        <name-node>${nameNode}</name-node>
        <configuration>
            <property>
                <name>mapred.job.queue.name</name>
                <value>${queueName}</value>
            </property>
        </configuration>
        <script>hiveScript.hql</script>
    </hive>
    <ok to="end" />
    <error to="kill" />
</action>
How can I tell the script where the output should go?
That is not possible with Oozie in the way that you want. This is because Oozie starts most of its workflow actions on nodes within the cluster, so there is no fixed local machine for a redirect like > output.txt to write to.
That said, you could use the Oozie Shell action to run
hive -f hiveScript.hql > output.txt
... however, this has other implications: Hive would need to be installed on every node, your hiveScript.hql would need to be available everywhere, etc. It also doesn't quite work because your output file would end up on whichever node was assigned to run this shell action. https://oozie.apache.org/docs/3.3.0/DG_ShellActionExtension.html
I think your best bet would be to include
INSERT OVERWRITE DIRECTORY '/tmp/hdfs_out' SELECT * FROM ...
in your hiveScript.hql file and pull the results down from HDFS afterwards.
Edit: Another option I just thought of would be to use the SSH Action. https://oozie.apache.org/docs/3.2.0-incubating/DG_SshActionExtension.html You could potentially have the SSH Action connect to your target machine and run
hive -f hiveScript.hql > output.txt
.
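For illustration, the SSH Action route might look roughly like the sketch below. It is untested and rests on several assumptions: the action name, the etl@target-host login, and the /home/etl/run_hive.sh path are all placeholders, and the wrapper script is assumed to already exist on the target machine and to contain the hive -f hiveScript.hql > output.txt line. The redirect sits in the wrapper script because I am not certain how a redirect placed directly in the command element would be handled. Passwordless SSH from the user Oozie runs as to the target host is assumed as well.

```xml
<!-- Sketch only: action name, host, and script path are hypothetical -->
<action name="hive-over-ssh">
    <ssh xmlns="uri:oozie:ssh-action:0.1">
        <!-- user@host of the machine where output.txt should end up -->
        <host>etl@target-host</host>
        <!-- wrapper script on that machine; it runs
             hive -f hiveScript.hql > output.txt -->
        <command>/home/etl/run_hive.sh</command>
    </ssh>
    <ok to="end" />
    <error to="kill" />
</action>
```

With this approach the output file lands on a machine you choose rather than on an arbitrary cluster node, at the cost of needing Hive and the script installed there.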