I can't seem to deploy a model-serving endpoint in Databricks from a finetuned model in DBFS


I have finetuned a GPT-2 model in Databricks and saved the model and tokenizer to DBFS. I am trying to log, register, and deploy that model to an endpoint, though I am not sure this is currently possible. I have followed the examples provided here, here, here, here, and here; they work for me as written, but I haven't been able to swap the foundation models they use for a finetuned model of my own. Here is some code that very closely approximates what I use to log and register the model:

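import os

import accelerate
import mlflow
import pandas as pd
import torch
import transformers
from mlflow import MlflowClient
from mlflow.models.signature import ModelSignature
from mlflow.types import ColSpec, DataType, Schema
from transformers import GPT2Config, GPT2LMHeadModel, GPT2Tokenizer

# Define PythonModel to log with mlflow.pyfunc.log_model
class GPT(mlflow.pyfunc.PythonModel):
    def load_context(self, context):
        """This method initializes the tokenizer and language model."""
        self.tokenizer = GPT2Tokenizer.from_pretrained(snapshot_location)
        config = GPT2Config.from_pretrained(snapshot_location)
        self.model = GPT2LMHeadModel.from_pretrained(snapshot_location, config=config)

    def predict(self, context, model_input):
        """This method generates prediction for the given input."""
        generated_text = []
        for index, row in model_input.iterrows():
            prompt = row["prompt"]
            temperature = model_input.get("temperature", [0.7])[0]
            max_new_tokens = model_input.get("max_new_tokens", [100])[0]
            full_prompt = prompt
            # Keep the inputs on the same device as the model
            encoded_input = self.tokenizer.encode(full_prompt, return_tensors="pt").to(self.model.device)
            # Arguments after encoded_input are assumed from the variables above
            # (the original call is truncated at this point)
            output = self.model.generate(
                encoded_input,
                do_sample=True,
                temperature=temperature,
                max_new_tokens=max_new_tokens,
            )
            prompt_length = len(encoded_input[0])
            generated_text.append(self.tokenizer.batch_decode(output[:,prompt_length:], skip_special_tokens=True))
        return pd.Series(generated_text)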

# Define input and output schema
input_schema = Schema([
    ColSpec(DataType.string, "prompt"), 
    ColSpec(DataType.double, "temperature", optional=False), 
    ColSpec(DataType.integer, "max_new_tokens", optional=False)])
output_schema = Schema([ColSpec(DataType.string, 'output')])
signature = ModelSignature(inputs=input_schema, outputs=output_schema)

# Define input example
input_example = pd.DataFrame({
            "prompt": ["..."],  # prompt text elided
            "temperature": [0.7],
            "max_new_tokens": [100]})

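# Log the model with its details such as artifacts, pip requirements and input example
torch_version = torch.__version__.split("+")[0]
with mlflow.start_run() as run:
    mlflow.pyfunc.log_model(
        "model",  # artifact path; assumed, this argument is missing from the snippet
        python_model=GPT(),
        artifacts={'repository' : snapshot_location},
        pip_requirements=[f"torch=={torch_version}",
                          f"transformers=={transformers.__version__}",
                          f"accelerate=={accelerate.__version__}", "einops", "sentencepiece"],
        input_example=input_example,
        signature=signature,
    )

registered_name = "models.default.gpt" # Note that the UC model name follows the pattern <catalog_name>.<schema_name>.<model_name>, corresponding to the catalog, schema, and registered model name

# Register model in MLflow Model Registry
result = mlflow.register_model(
    f"runs:/{run.info.run_id}/model",  # assumes the "model" artifact path used above
    registered_name,
)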
client = MlflowClient(registry_uri="databricks-uc")

# Choose the right model version registered in the above cell.
client.set_registered_model_alias(name=registered_name, alias="GPT_model", version=result.version)

I can successfully log and register the model; the issues arise when I try to prompt the model within a notebook or set up a model serving endpoint. As you can tell, I have not deviated much from the provided tutorials, but I believe everything hinges on where the snapshot_location variable points. I first pointed it at a DBFS file path, and serving the model then failed with this error in the service logs: An error occurred while loading the model. Repo id must be in the form 'repo_name' or 'namespace/repo_name': '/dbfs/$FILEPATH'. Use 'repo_type' argument if needed. I then tried the code:

snapshot_location = os.path.expanduser("~/gpt/output")
os.makedirs(snapshot_location, exist_ok=True)

which gave me the error HFValidationError: Repo id must be in the form 'repo_name' or 'namespace/repo_name': '/root/gpt/output'. Use 'repo_type' argument if needed. while trying to prompt the model within a notebook.
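One direction that might be worth checking, as a sketch rather than a confirmed fix: MLflow copies whatever is passed in artifacts into the logged model and exposes the resolved local paths as the dict context.artifacts at load time, so load_context could read the directory from there instead of from snapshot_location. As far as I can tell, the "Repo id" error tends to appear when the string given to from_pretrained is not an existing local directory, so the library falls back to treating it as a Hub repo id. The helper name resolve_snapshot_dir and the config.json check below are my own assumptions for illustration; they are not part of MLflow or transformers.

```python
import os


def resolve_snapshot_dir(artifacts, key="repository"):
    """Return the local directory that MLflow resolved for an artifact key.

    `artifacts` is expected to be the dict exposed as `context.artifacts`
    inside `load_context`. The function name and the config.json check are
    illustrative assumptions, not part of MLflow or transformers.
    """
    if key not in artifacts:
        raise KeyError(f"No artifact named {key!r}; available: {sorted(artifacts)}")
    path = artifacts[key]
    # Fail early with a readable message instead of a Hub repo id error.
    if not os.path.isdir(path):
        raise FileNotFoundError(f"{path!r} is not a directory on this machine")
    if not os.path.isfile(os.path.join(path, "config.json")):
        raise FileNotFoundError(f"{path!r} has no config.json; was the model saved there?")
    return path
```

Inside load_context this would be called as snapshot_dir = resolve_snapshot_dir(context.artifacts), with snapshot_dir then passed to each from_pretrained call in place of snapshot_location.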

My question is: is there a way to pass the repo_type argument somewhere in there? If I can't save the model in DBFS, should I be saving it in an S3 bucket? Googling the error led me to a lot of GitHub comments about including the full file path, but I am not sure how applicable that is to my problem. I am going to try moving the files from DBFS onto the local disk and registering the model from there. Am I going about this the right way, though? Or is the issue not the location but my logging and registration code? Any insight would be greatly appreciated.
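The plan to move the files from DBFS onto the local disk could look roughly like the sketch below. It assumes the cluster exposes DBFS through the /dbfs/ FUSE mount, so that a dbfs:/... URI maps to a /dbfs/... path; the function names dbfs_uri_to_local_path and stage_snapshot are hypothetical, and the destination directory is whatever local path I choose.

```python
import os
import shutil


def dbfs_uri_to_local_path(path):
    """Map a 'dbfs:/...' URI to its '/dbfs/...' FUSE path; leave other paths untouched."""
    prefix = "dbfs:/"
    if path.startswith(prefix):
        return "/dbfs/" + path[len(prefix):].lstrip("/")
    return path


def stage_snapshot(src, dst):
    """Copy a saved model directory to `dst` and return the sorted names found there."""
    src = dbfs_uri_to_local_path(src)
    if not os.path.isdir(src):
        raise FileNotFoundError(f"{src!r} is not a directory")
    # dirs_exist_ok requires Python 3.8 or newer.
    shutil.copytree(src, dst, dirs_exist_ok=True)
    return sorted(os.listdir(dst))
```

After staging, snapshot_location would point at dst when calling log_model, so the files are copied into the logged model from local disk rather than referenced by a path that may not exist where the model is later loaded.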

