I am using spark-operator for Kubernetes.
https://github.com/GoogleCloudPlatform/spark-on-k8s-operator
I am able to run jobs successfully, but monitoring and troubleshooting them is difficult because the driver and executor pods are dynamic.
I want to know the best possible way to enable the Spark history server (with event logs written to s3a://<bucket>) alongside spark-operator.
Also, how can I store the stdout and stderr logs of the driver and executors for each job in s3a://<bucket>?
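For context, this is roughly the kind of SparkApplication spec I have in mind for the event-log part; the job name, image, main class, and credential handling are placeholders on my side, and I assume the history server would be deployed separately with spark.history.fs.logDirectory pointing at the same s3a:// path:

```yaml
# Sketch only: bucket, job name, image and credentials are placeholders.
apiVersion: "sparkoperator.k8s.io/v1beta2"
kind: SparkApplication
metadata:
  name: my-spark-job            # hypothetical job name
  namespace: default
spec:
  type: Scala
  mode: cluster
  image: "gcr.io/spark-operator/spark:v3.1.1"    # whichever Spark image is in use
  mainClass: org.example.MyJob                   # hypothetical main class
  mainApplicationFile: "s3a://<bucket>/jars/my-job.jar"
  sparkConf:
    # write Spark event logs to S3 so a history server can read them later
    "spark.eventLog.enabled": "true"
    "spark.eventLog.dir": "s3a://<bucket>/spark-events"
  hadoopConf:
    # S3A filesystem settings; credentials could also come from a secret or an IAM role
    "fs.s3a.impl": "org.apache.hadoop.fs.s3a.S3AFileSystem"
  driver:
    cores: 1
    memory: "1g"
    serviceAccount: spark
  executor:
    instances: 2
    cores: 1
    memory: "1g"
```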
I think using Filebeat to collect the pod logs and store them in Elasticsearch is a good practice.
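For the Filebeat side, I was picturing an autodiscover configuration along these lines in the Filebeat DaemonSet's filebeat.yml; the spark-role label match and the Elasticsearch endpoint are assumptions:

```yaml
# Sketch of a filebeat.yml fragment; label selector and Elasticsearch host are assumed.
filebeat.autodiscover:
  providers:
    - type: kubernetes
      templates:
        - condition:
            equals:
              kubernetes.labels.spark-role: "driver"   # a second template could match "executor"
          config:
            - type: container
              paths:
                - /var/log/containers/*${data.kubernetes.container.id}.log
processors:
  - add_kubernetes_metadata: {}
output.elasticsearch:
  hosts: ["http://elasticsearch:9200"]   # hypothetical Elasticsearch service
```

Would this be a reasonable approach, or is there a more standard way to do this with spark-operator?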