I have a Spark Streaming application running inside a k8s cluster (using the spark-operator).
There is 1 executor, reading batches every 5s from a Kinesis stream. The Kinesis stream has 12 shards, and the executor creates 1 receiver per shard. I gave the executor 24 cores, which should be more than enough to run the 12 receivers and still process the batches.
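For reference, the receivers are created roughly like this (a simplified sketch; the stream name, region, endpoint and checkpoint app name are placeholders):

```scala
import org.apache.spark.SparkConf
import org.apache.spark.storage.StorageLevel
import org.apache.spark.streaming.{Seconds, StreamingContext}
import org.apache.spark.streaming.kinesis.{KinesisInitialPositions, KinesisInputDStream}

object KinesisConsumer {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setAppName("kinesis-consumer")
    val ssc = new StreamingContext(conf, Seconds(5)) // 5-second batches

    val numShards = 12
    // One receiver-based DStream per shard, unioned into a single stream
    val shardStreams = (0 until numShards).map { _ =>
      KinesisInputDStream.builder
        .streamingContext(ssc)
        .streamName("my-stream")                                // placeholder
        .endpointUrl("https://kinesis.us-east-1.amazonaws.com") // placeholder
        .regionName("us-east-1")                                // placeholder
        .checkpointAppName("kinesis-consumer")                  // placeholder
        .initialPosition(new KinesisInitialPositions.Latest())
        .checkpointInterval(Seconds(5))
        .storageLevel(StorageLevel.MEMORY_AND_DISK_2)
        .build()
    }
    val records = ssc.union(shardStreams)

    records.foreachRDD { rdd =>
      // actual processing elided
      println(s"Batch with ${rdd.count()} records")
    }

    ssc.start()
    ssc.awaitTermination()
  }
}
```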
Sometimes, for a reason I have not identified, the executor crashes. I suspect its memory goes over the k8s pod memory limit, which would cause k8s to kill the pod, but I have not been able to prove this theory yet.
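In case it is relevant: as far as I understand, the executor pod's memory request/limit is derived from the executor memory plus the memory overhead, i.e. from settings like these (values are placeholders):

```scala
import org.apache.spark.SparkConf

// As far as I understand, on k8s the executor pod's memory request/limit is
// derived from spark.executor.memory plus spark.executor.memoryOverhead.
val conf = new SparkConf()
  .set("spark.executor.memory", "4g")           // placeholder value
  .set("spark.executor.memoryOverhead", "400m") // placeholder; default is max(384m, 10% of executor memory)
```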
After the executor crashes, a new one is created.
However, the "work" stops. The new executor is not doing anything.
I investigated a bit:
Looking at the new pod's logs, I saw that it executed a few tasks successfully after it was created, and then it stopped because it was not given any more tasks.
Looking at the Spark Web UI, I see that there is 1 "running batch" that never finishes.
I found some docs saying there can only be 1 active batch at a time, which explains why the work stopped: Spark is waiting for this stuck batch to finish before it starts the next one.
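From what I can tell, this matches the (undocumented) spark.streaming.concurrentJobs setting, which defaults to 1:

```scala
import org.apache.spark.SparkConf

// spark.streaming.concurrentJobs controls how many batch jobs the streaming
// scheduler runs concurrently; its default of 1 means the next batch cannot
// start while the stuck batch is still "running".
val conf = new SparkConf()
  .setAppName("kinesis-consumer")
  .set("spark.streaming.concurrentJobs", "1") // the default
```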
Digging a bit deeper in the UI, I found the page that shows the per-task details for that batch.
So executor 2 finished all the tasks it was assigned. The 12 tasks that were assigned to executor 1 are still waiting to finish, but executor 1 is dead.
Why does this happen? Why does Spark not realize that executor 1 is dead and will never finish its assigned tasks? I would expect Spark to reassign these tasks to executor 2.
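For reference, these are the settings that I would expect to govern dead-executor detection and task re-scheduling (the values below are the documented defaults, as far as I understand them); I would be happy to hear if one of these is what I need to tune:

```scala
import org.apache.spark.SparkConf

// Settings I would expect to control dead-executor detection and task re-scheduling.
// Values shown are the documented defaults, as far as I understand them.
val conf = new SparkConf()
  .set("spark.executor.heartbeatInterval", "10s") // executor -> driver heartbeats
  .set("spark.network.timeout", "120s")           // executor considered lost if no heartbeat within this
  .set("spark.task.maxFailures", "4")             // task attempts before the job is failed
  .set("spark.locality.wait", "3s")               // how long the scheduler waits for a preferred location
```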