Error logging Spark model with MLflow on Databricks - mlflow.spark.log_model()

I am attempting to log a Spark model using the code snippet below. The model metrics and parameters are saved in the MLflow run, but the model itself is not saved under the run's artifacts. However, when I log a scikit-learn model with mlflow.sklearn.log_model() in the same environment, the model is saved successfully.

Environment: Databricks 10.4 LTS ML cluster
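
For reference, a minimal sketch of the scikit-learn path that does work in this environment (the LogisticRegression model and toy data below are placeholders, not my actual pipeline):

import mlflow
import mlflow.sklearn
from sklearn.linear_model import LogisticRegression

# placeholder data and model, purely to illustrate the call that works
X = [[0.0], [1.0], [2.0], [3.0]]
y = [0, 0, 1, 1]
sk_model = LogisticRegression().fit(X, y)

with mlflow.start_run(run_name="sklearn_debug_run"):
    # the model is saved under the run's artifacts without any error
    mlflow.sklearn.log_model(sk_model, artifact_path="sk_model")

The Spark pipeline that fails is below.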

import mlflow
import mlflow.spark
import numpy as np
from mlflow.tracking.artifact_utils import get_artifact_uri
from pyspark.ml.classification import RandomForestClassifier
from pyspark.ml.evaluation import BinaryClassificationEvaluator
from pyspark.ml.tuning import CrossValidator, ParamGridBuilder

train, test = train_test_random_split(conf, data)

experiment_name = "/mlflow_experiments/debug_spark_model"
mlflow.set_experiment(experiment_name)


evaluator = BinaryClassificationEvaluator()

rf = RandomForestClassifier()

param_grid = (
    ParamGridBuilder()
    .addGrid(rf.numTrees, [15])
    .addGrid(rf.maxDepth, [6])
    .addGrid(
        rf.minInstancesPerNode,
       [7],
    )
    .build()
)

cv = CrossValidator(
    estimator=rf,
    estimatorParamMaps=param_grid,
    evaluator=BinaryClassificationEvaluator(metricName="areaUnderROC"),
    numFolds=10,
)
cv_model = cv.fit(train)

# best model
model = cv_model.bestModel

model_params_best = {
    "numTrees": cv_model.getEstimatorParamMaps()[np.argmax(cv_model.avgMetrics)][
        cv_model.bestModel.numTrees
    ],
    "maxDepth": cv_model.getEstimatorParamMaps()[np.argmax(cv_model.avgMetrics)][
        cv_model.bestModel.maxDepth
    ],
    "minInstancesPerNode": cv_model.getEstimatorParamMaps()[
        np.argmax(cv_model.avgMetrics)
    ][cv_model.bestModel.minInstancesPerNode],
}

model_metrics_best, artifacts_best, predicted_df_best = train_model(
    model, train, test, evaluator
)
with mlflow.start_run(run_name="debug_run_1"):
    run_id = mlflow.active_run().info.run_id
    mlflow.log_params(model_params_best)
    mlflow.log_metrics(model_metrics_best)

    # debug 1
    artifact_path = "best_model"
    mlflow.spark.log_model(spark_model=model, artifact_path=artifact_path)
    source = get_artifact_uri(run_id=run_id, artifact_path=artifact_path)

This produces the following error:

com.databricks.mlflowdbfs.MlflowHttpException: statusCode=404 reasonPhrase=[Not Found] bodyMessage=[{"error_code":"RESOURCE_DOES_NOT_EXIST","message":"Run 'bfe90fd5074f49c39a475b613d020cbf' not found."}]


I would appreciate any direction for debugging this, or a solution to the error.

1 Answer

Answered by Kavishka Gamage

I found a workaround for this error (and for most errors related to mlflowdbfs).

Disabling mlflowdbfs on the Databricks ML Runtime cluster resolved the error above. Another option is to use a standard (non-ML) Databricks Runtime cluster.

import os
os.environ["DISABLE_MLFLOWDBFS"] = "true"
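
Applied to the run from the question, this looks roughly like the sketch below (it reuses the question's variables; the comments describe the intended effect, not a guarantee for every runtime version):

import os
import mlflow
import mlflow.spark

# opt out of mlflowdbfs before the run that logs the Spark model
os.environ["DISABLE_MLFLOWDBFS"] = "true"

with mlflow.start_run(run_name="debug_run_1"):
    run_id = mlflow.active_run().info.run_id
    mlflow.log_params(model_params_best)    # same dict as in the question
    mlflow.log_metrics(model_metrics_best)  # same dict as in the question

    # with mlflowdbfs disabled, the model should now appear under the run's artifacts
    mlflow.spark.log_model(spark_model=model, artifact_path="best_model")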