I'm trying to schedule an Oozie job that runs daily. From the shell, I'm using this command:
oozie job -oozie $OOZIE_URL -run -verbose \
-config $PWD/this_file_is_a_formality.properties \
-Doozie.coord.application.path="hdfs:///path/to/file/aggregates_workflow.xml" \
-Dstart="$START" \
-Dend="$END"
(Assume all the environment variables are correctly set.)
I'm getting this error:
Error: E0701 : E0701: XML schema error, cvc-elt.1.a: Cannot find the declaration of element 'workflow-app'.
I believe Oozie is parsing my workflow XML file, but it's not correctly recognizing the valid XML in the file.
The aggregates_workflow.xml file:
<workflow-app xmlns="uri:oozie:workflow:0.5" name='PREAGGREGATED'>
    <global>
        <job-tracker>${jobTracker}</job-tracker>
        <name-node>${nameNode}</name-node>
        <configuration>
            <property>
                ...
            </property>
        </configuration>
    </global>
    <start to="spark-node"/>
    <action name="spark-node">
        <spark xmlns="uri:oozie:spark-action:0.1">
            <job-tracker>yarnRM</job-tracker>
            <name-node>PREAGGREGATED</name-node>
            <configuration>
                <property> ...
                </property>
            </configuration>
            <master>yarn-client</master>
            <mode>client</mode>
            <name>${appName}</name>
            <class>${className}</class>
            <jar>${jarPath}</jar>
            <spark-opts>...0</spark-opts>
        </spark>
        <ok to="end"/>
        <error to="fail"/>
    </action>
    ...
</workflow-app>
I'd appreciate a diagnosis. Any idea why this wouldn't work?
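Really simple mistake: the third line of the start command, -Doozie.coord.application.path, was pointing to the workflow XML. That property has to point to the coordinator XML (the file whose root element is coordinator-app), which in turn references the workflow. Oozie validates the file at that path against the coordinator schema, so it could not find a declaration for 'workflow-app' and raised E0701.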
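For anyone hitting the same error, here is a rough sketch of what a daily coordinator definition might look like. It is not the file from my setup: the coordinator name, the UTC timezone, and the workflowAppPath property are illustrative assumptions, and start and end are assumed to be the values passed with -Dstart and -Dend in Oozie's UTC datetime format (for example 2015-01-01T00:00Z).

```xml
<!-- Hypothetical coordinator.xml: name, timezone, and workflowAppPath are assumptions -->
<coordinator-app xmlns="uri:oozie:coordinator:0.4" name="PREAGGREGATED_DAILY"
                 frequency="${coord:days(1)}" start="${start}" end="${end}" timezone="UTC">
    <action>
        <workflow>
            <!-- HDFS location of the workflow definition (the workflow-app file) -->
            <app-path>${workflowAppPath}</app-path>
        </workflow>
    </action>
</coordinator-app>
```

With a file like this in HDFS, oozie.coord.application.path would point to the coordinator file, and the workflow is only referenced from inside it through app-path.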