I am performing some operations using DataProcPySparkOperator. This operator only takes a cluster name as a parameter; there is no option to specify a region, and by default it assumes the cluster is in the global region.
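For reference, this is roughly how I'm invoking it (the DAG, bucket path, and cluster name below are placeholders):

```python
from datetime import datetime

from airflow import DAG
from airflow.contrib.operators.dataproc_operator import DataProcPySparkOperator

dag = DAG(
    dag_id='example_pyspark_dag',       # placeholder name
    start_date=datetime(2018, 1, 1),
    schedule_interval=None,
)

# The operator accepts a cluster_name but exposes no region parameter,
# so the job is always submitted against the 'global' region endpoint.
submit_pyspark = DataProcPySparkOperator(
    task_id='submit_pyspark_job',
    main='gs://my-bucket/jobs/job.py',  # placeholder GCS path
    cluster_name='my-cluster',          # cluster lives in a non-global region
    dag=dag,
)
```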
For clusters in a region other than global, the following error occurs:
```
googleapiclient.errors.HttpError: https://dataproc.googleapis.com/v1/projects//regions/global/jobs:submit?alt=json returned "No current cluster for project id '' with name ''"
```
Am I missing anything, or is this just a limitation of these operators?
These DataProc{PySpark|Spark|Hive|Hadoop|...}Operators simply don't support a region argument today. An Airflow issue has been created, and I'll submit a fix in the next few days.
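Until that fix lands, one possible interim workaround (not the operator's own API, just a sketch) is to submit the job through the gcloud CLI, which does accept a --region flag, e.g. via a BashOperator. The bucket path, cluster name, and region below are placeholders:

```python
from datetime import datetime

from airflow import DAG
from airflow.operators.bash_operator import BashOperator

dag = DAG(
    dag_id='pyspark_region_workaround',     # placeholder name
    start_date=datetime(2018, 1, 1),
    schedule_interval=None,
)

# gcloud's job-submit command accepts an explicit --region, unlike the
# current DataProcPySparkOperator, so shelling out sidesteps the limitation.
submit_pyspark = BashOperator(
    task_id='submit_pyspark_job',
    bash_command=(
        'gcloud dataproc jobs submit pyspark '
        'gs://my-bucket/jobs/job.py '       # placeholder GCS path
        '--cluster=my-cluster '             # placeholder cluster name
        '--region=europe-west1'             # placeholder non-global region
    ),
    dag=dag,
)
```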