I'm running a Spark 3.0 application (Spark Structured Streaming) on Kubernetes and I'm trying to use the new native Prometheus metric sink. I'm able to make it work and get all the metrics described here.
However, the metrics I really need are the ones provided by enabling the spark.sql.streaming.metricsEnabled config, as proposed in this Spark Summit presentation. Even with that config set to "true", I can't see any streaming metrics under /metrics/executors/prometheus as advertised. One thing to note is that I can see them under /metrics/json, so we know the configuration was properly applied.
Why aren't streaming metrics sent to the Prometheus sink? Do I need to add some additional configuration? Is that not supported yet?
After quite a bit of investigation, I was able to make it work. In short, the Spark job k8s definition file needed one additional line to tell Spark where to find the metrics.properties config file.
Make sure to add the following line under sparkConf in the Spark job k8s definition file, and adjust it to your actual path. The path to the metrics.properties file should be set in your Dockerfile.
For reference, here's the rest of my sparkConf, for metric-related config.
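
As a minimal sketch of what such a sparkConf block can look like, assuming the job is defined as a spark-on-k8s-operator SparkApplication: the path /opt/spark/conf/metrics.properties is only a placeholder, and the selection of keys is illustrative rather than an exact copy of the original configuration.

```yaml
# Fragment of a SparkApplication manifest (assumed: spark-on-k8s-operator format).
spec:
  sparkConf:
    # Tells Spark where to find the metrics config file.
    # Placeholder path: point it at wherever your Dockerfile copies metrics.properties.
    "spark.metrics.conf": "/opt/spark/conf/metrics.properties"
    # Exposes executor metrics at /metrics/executors/prometheus on the driver UI.
    "spark.ui.prometheus.enabled": "true"
    # Registers Structured Streaming query metrics with the metrics system.
    "spark.sql.streaming.metricsEnabled": "true"

# Assumed contents of metrics.properties (enables the native PrometheusServlet sink):
#   *.sink.prometheusServlet.class=org.apache.spark.metrics.sink.PrometheusServlet
#   *.sink.prometheusServlet.path=/metrics/prometheus
```

If the file referenced by spark.metrics.conf is missing from the image, the sink settings in it are not loaded, so it is worth checking that the path in the manifest matches the one used in the Dockerfile.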