I have built a .mar file and a container that launches it with this command:
CMD ["torchserve", "--start", "--ncs", "--ts-config", "/home/config.properties", "--model-store", "/home/model-store/", "--models", "mymodel=my_model.mar"]
I also expose port 8081, as specified in my config.properties.
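For reference, the relevant part of my config.properties looks roughly like this (the exact addresses here are illustrative, not a verbatim copy of my file):

inference_address=http://0.0.0.0:8081
management_address=http://0.0.0.0:8082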
Now, if I run the container locally, I can communicate with it using this URL: http://localhost:8081/v1/models/my_model:predict
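For example, a request like this succeeds against the local container (the payload shown is just a placeholder; the real instances depend on the model's handler):

import requests

payload = {"instances": [[0.0, 1.0, 2.0]]}  # hypothetical input, for illustration only
resp = requests.post(
    "http://localhost:8081/v1/models/my_model:predict",
    json=payload,
)
print(resp.status_code, resp.json())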
I then upload the model to Vertex AI using that same predict route:
from google.cloud import aiplatform

model = aiplatform.Model.upload(
    display_name=model_display_name,
    description=model_description,
    serving_container_image_uri=CUSTOM_PREDICTOR_IMAGE_URI,
    # the route Vertex AI forwards prediction requests to inside the container
    serving_container_predict_route="/v1/models/my_model:predict",
    serving_container_health_route=health_route,
    serving_container_ports=serving_container_ports,
)
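The deployment itself is the standard SDK call, sketched here with an example machine type (not necessarily the one I use):

endpoint = model.deploy(
    deployed_model_display_name=model_display_name,
    machine_type="n1-standard-4",
)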
However, the problem comes when I deploy the endpoint: the model loads and shows as ready, and the health ping returns a 200, but when I send a prediction request I get this response:
{
  "code": 404,
  "type": "ModelNotFoundException",
  "message": "Model not found: my_model"
}
I can also see this log entry:
{"@type":"type.googleapis.com/google.cloud.aiplatform.logging.OnlinePredictionLogEntry", "deployedModelId":"***", "endpoint":"projects/***/locations/***/endpoints/***", "error":{…}, "instanceCount":"1"}
If I try the sample request, I also get a 404:
The operation failed due to the following error(s):
Invalid HTTP message.
What should I put as the predict route?