Spark Driver stopped when calling a notebook multiple times


The problem: I have a Databricks notebook (Python/PySpark). Inside a for loop, I need to call another Databricks notebook to perform some validations (Great Expectations), so the code in notebook 1 looks like this:

for d in mylist:
    result = dbutils.notebook.run(
        "./checkdata/check_data",
        5000,
        params,
    )
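For reference, here is a runnable sketch of that loop with the notebook call stubbed out, since `dbutils` only exists on Databricks. The item list, the `params` keys, and the stub function are hypothetical; on a real cluster the stub body would simply be the `dbutils.notebook.run("./checkdata/check_data", 5000, params)` call shown above, which returns the child notebook's exit value as a string.

```python
import json

def run_check_notebook(params):
    # Stand-in for dbutils.notebook.run("./checkdata/check_data", 5000, params).
    # Stubbed so this sketch runs outside Databricks; the real call returns
    # whatever the child notebook passes to dbutils.notebook.exit(), as a string.
    return json.dumps({"status": "ok", "item": params["item"]})

mylist = ["table_a", "table_b", "table_c"]  # hypothetical items to validate

results = []
for d in mylist:
    # dbutils.notebook.run expects its arguments as a dict of str -> str
    params = {"item": str(d)}
    result = run_check_notebook(params)
    results.append(json.loads(result))
```

Each call to `dbutils.notebook.run` launches an ephemeral notebook job on the same cluster, so per-call overhead accumulates on the driver across iterations.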

This process runs successfully for a limited number of calls (around 70-80). As soon as it crosses that limit (about 80 iterations), I start getting the error

"Spark driver has stopped unexpectedly and is restarting. Your notebook will be reattached"

When I run the notebook again, the message reappears after only a few iterations (e.g. 3-4), which suggests the driver's memory is still full from the previous run.

Once I restart the cluster, it runs for 70-80 iterations again, but then breaks.

I have already gone through one of the solutions provided here: Spark driver stopped unexpectedly

but it doesn't solve my problem.

I have no choice but to call the notebook and pass it the parameters, so sending all the parameters over and running the loop inside the other notebook is not possible in this scenario.

Any advice on how to fix this issue please?
