Occasionally one of my code locations fails to load or crashes because of a bug in my code.
Obviously I need to fix the bug, but I would also like to be able to alert on the error.
I noticed that Dagster itself is able to detect a failing code location (see image), but I have not been able to monitor this event.
I am wondering whether there is a way to extract it as a metric, possibly through a Prometheus endpoint or some other mechanism.
I read the docs on hooks and Prometheus, but these all operate at the job level.
Is there a way to gather these statistics at the daemon or orchestrator level?