FeatureStoreClient() log_model failing to run inference with mlflow.spark flavor


I am logging a model with FeatureStoreClient().log_model(..., flavor=mlflow.spark, ...), and afterwards I try to run inference with it in a Databricks runtime environment using:

fs.score_batch(f"models:/{model_name}/Production", batch_scoring)

Running the batch inference produces the following error:

2023/11/17 20:51:40 WARNING mlflow.pyfunc: Detected one or more mismatches between the model's dependencies and the current Python environment:
 - databricks-feature-lookup (current: uninstalled, required: databricks-feature-lookup==0.*)
To fix the mismatches, call `mlflow.pyfunc.get_model_dependencies(model_uri)` to fetch the model's environment and install dependencies using the resulting environment file.
2023/11/17 20:51:40 WARNING mlflow.pyfunc: Calling `spark_udf()` with `env_manager="local"` does not recreate the same environment that was used during training, which may lead to errors or inaccurate predictions. We recommend specifying `env_manager="conda"`, which automatically recreates the environment that was used to train the model and performs inference in the recreated environment.
PythonException: 'ModuleNotFoundError: No module named 'ml''. Full traceback below:

I do not know how this could be fixed; any help would be appreciated.

I have been:

- Trying different flavors in log_model
- Trying to better understand the MLflow model flavors API


There are 2 answers

Answer by Mancho Kiria:

It seems that model.predict cannot find the ml module. Check your code: you may be using a custom ml.py module (or an ml package), so simply search your training code for imports. Any custom code must be included when you call fs.log_model.
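As a quick way to find such imports, here is a small sketch using Python's standard ast module to list the top-level modules a training script imports (the training_code string below is a made-up example, not your actual code):

```python
import ast

def find_imported_modules(source: str) -> set[str]:
    """Return the top-level module names imported by the given source code."""
    tree = ast.parse(source)
    modules = set()
    for node in ast.walk(tree):
        if isinstance(node, ast.Import):
            for alias in node.names:
                modules.add(alias.name.split(".")[0])
        elif isinstance(node, ast.ImportFrom):
            if node.module and node.level == 0:  # skip relative imports
                modules.add(node.module.split(".")[0])
    return modules

# Hypothetical training script: it imports a custom "ml" package that must be
# shipped with the model, otherwise score_batch() fails at model load time.
training_code = """
import mlflow
from ml.features import build_features
"""
print(find_imported_modules(training_code))  # the set includes 'ml' and 'mlflow'
```

Any name in that set which is neither in the standard library nor in the model's pip requirements is a candidate for the missing-module error.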

Answer by Mojgan Mazouchi:

The error "'ModuleNotFoundError: No module named 'ml'" suggests that the ml module you want to use and its artifacts are not accessible to the score_batch() API. It is likely that you did not specify code_path correctly when you logged your model using fs.log_model(). If you are using a custom module defined in a separate notebook or file from the one you use for inference, make sure to pass the path to that custom module in the code_path argument of the log_model() API. I added a code example in this repo.
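A hedged sketch of what that logging call could look like, assuming the custom code lives in an `ml` package; the workspace path, model name, and training_set are placeholders, and the exact keyword (`code_path` vs. `code_paths`) depends on your mlflow and databricks-feature-store versions, so check the docs for your flavor:

```python
from databricks.feature_store import FeatureStoreClient
import mlflow

fs = FeatureStoreClient()

fs.log_model(
    model,                      # your trained Spark model (placeholder)
    artifact_path="model",
    flavor=mlflow.spark,
    training_set=training_set,  # the TrainingSet used to fit the model (placeholder)
    registered_model_name="my_model",           # hypothetical name
    code_path=["/Workspace/Repos/my_repo/ml"],  # ships the custom `ml` package with the model
)
```

With the package logged alongside the model, `models:/{model_name}/Production` should be able to import `ml` at scoring time without it being installed on the inference cluster.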