I have a question and would appreciate expert advice. Can I run a Dataproc (PySpark) job as a container image on Cloud Run (as a service or a job)? The Dataproc job may take anywhere from a few minutes to a few hours to complete, so the orchestration would be asynchronous. It can be batch or event-driven, triggered either by a REST API call or by a Pub/Sub event.
I hope my question is clear.
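To be concrete, what I have in mind is the Cloud Run container *submitting* the Spark work to Dataproc (Dataproc Serverless in this rough sketch) rather than running Spark inside Cloud Run itself. Project, region, and GCS paths below are placeholders:

```python
# Rough sketch: a Cloud Run service that submits Spark work to
# Dataproc Serverless and returns immediately (asynchronous).
# Project ID, region, and GCS paths are placeholders.
from flask import Flask, request
from google.cloud import dataproc_v1

app = Flask(__name__)

PROJECT = "my-project"   # placeholder
REGION = "us-central1"   # placeholder

@app.route("/submit", methods=["POST"])
def submit():
    # Works for a direct REST call or a Pub/Sub push subscription;
    # a Pub/Sub push wraps the payload in {"message": {...}}.
    client = dataproc_v1.BatchControllerClient(
        client_options={"api_endpoint": f"{REGION}-dataproc.googleapis.com:443"}
    )
    batch = dataproc_v1.Batch(
        pyspark_batch=dataproc_v1.PySparkBatch(
            main_python_file_uri="gs://my-bucket/jobs/etl.py"  # placeholder
        )
    )
    # create_batch returns a long-running operation; we do NOT wait for
    # the Spark job to finish, so the HTTP request completes in seconds
    # regardless of how long the job itself runs.
    operation = client.create_batch(
        parent=f"projects/{PROJECT}/locations/{REGION}", batch=batch
    )
    return {"batch": operation.metadata.batch}, 202

if __name__ == "__main__":
    app.run(host="0.0.0.0", port=8080)
```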
Options I am considering (I feel options 1 and 2 are more costly than option 3):
- Run the Dataproc job via Cloud Composer (even with autoscaling, Composer keeps running while idle).
- Run on GKE Autopilot (again, with a custom image of the Spark job).
- Run on Cloud Run (as a service if the job finishes within 60 minutes, or as a job if it takes a few hours), which is charged based on the number of requests and actual usage.
The requirement is that the Dataproc Spark jobs (ephemeral workflows) run on demand (perhaps a few requests per day), hence I am thinking of using Cloud Run.
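By "ephemeral workflow" I mean something like an inline Dataproc workflow template, which creates a cluster, runs the job, and deletes the cluster afterwards. A rough sketch (cluster config and paths are placeholders):

```python
# Rough sketch of the "workflow-ephemeral" pattern: an inline Dataproc
# workflow template spins up a managed cluster, runs the PySpark job,
# and tears the cluster down again. All names are placeholders.
from google.cloud import dataproc_v1

PROJECT = "my-project"   # placeholder
REGION = "us-central1"   # placeholder

def run_ephemeral_workflow():
    client = dataproc_v1.WorkflowTemplateServiceClient(
        client_options={"api_endpoint": f"{REGION}-dataproc.googleapis.com:443"}
    )
    template = dataproc_v1.WorkflowTemplate(
        id="on-demand-etl",
        placement=dataproc_v1.WorkflowTemplatePlacement(
            managed_cluster=dataproc_v1.ManagedCluster(
                cluster_name="ephemeral-etl",
                config=dataproc_v1.ClusterConfig(),  # defaults; tune as needed
            )
        ),
        jobs=[
            dataproc_v1.OrderedJob(
                step_id="etl",
                pyspark_job=dataproc_v1.PySparkJob(
                    main_python_file_uri="gs://my-bucket/jobs/etl.py"  # placeholder
                ),
            )
        ],
    )
    # Returns a long-running operation; the managed cluster is deleted
    # automatically once the workflow finishes.
    return client.instantiate_inline_workflow_template(
        parent=f"projects/{PROJECT}/regions/{REGION}", template=template
    )
```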
With Cloud Run, jobs can be executed with workflow dependencies via Workflows, on a batch schedule via Cloud Scheduler, or, for events, via Eventarc.
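As I understand it, Scheduler, Workflows, and Eventarc all end up invoking the same Cloud Run Admin API, which can also be called directly. A rough sketch with placeholder names:

```python
# Rough sketch: starting a Cloud Run *job* execution programmatically.
# Cloud Scheduler, Workflows, and Eventarc can trigger the same job;
# this just shows the underlying API call. Names are placeholders.
from google.cloud import run_v2

def trigger_job():
    client = run_v2.JobsClient()
    # Format: projects/{project}/locations/{region}/jobs/{job-name}
    operation = client.run_job(
        name="projects/my-project/locations/us-central1/jobs/spark-submitter"
    )
    # Fire-and-forget: the job execution continues independently of
    # this caller, which matches the asynchronous requirement.
    print("Started execution:", operation.metadata.name)
```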
What would be the best architecture decision here?