I have been tasked with updating KServe from 0.7 to 0.9. Our company's mar files run fine on 0.7. After updating to KServe 0.9 the pods come up without issue, but when a request is sent it returns a 500 error. The logs are given below.
Model framework: pytorch
Deployment type: RawDeployment
Kubernetes version: 1.25
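For reference, the failing request is a plain POST to the v1 predict endpoint. A minimal reproduction in Python (the payload is a placeholder for our actual input; the host, port, and model name match the logs below):

```python
import requests

# Minimal reproduction of the failing call. The payload is a placeholder;
# host/port and the model name match the log output below.
url = "http://localhost:5000/v1/models/modelname:predict"
payload = {"instances": [{"data": "example-input"}]}

resp = requests.post(url, json=payload, timeout=30)
print(resp.status_code)  # 200 on KServe 0.7, 500 on 0.9
print(resp.text)
```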
Defaulted container "kserve-container" out of: kserve-container, storage-initializer (init)
WARNING: sun.reflect.Reflection.getCallerClass is not supported. This will impact performance.
2022-11-18T13:37:44,001 [INFO ] main org.pytorch.serve.servingsdk.impl.PluginsManager - Initializing plugins manager...
2022-11-18T13:37:44,203 [INFO ] main org.pytorch.serve.ModelServer -
Torchserve version: 0.6.0
TS Home: /usr/local/lib/python3.8/dist-packages
Current directory: /home/model-server
Temp directory: /home/model-server/tmp
Number of GPUs: 0
Number of CPUs: 1
Max heap size: 494 M
Python executable: /usr/bin/python
Config file: /mnt/models/config/config.properties
Inference address: http://0.0.0.0:8085
Management address: http://0.0.0.0:8085
Metrics address: http://0.0.0.0:8082
Model Store: /mnt/models/model-store
Initial Models: N/A
Log dir: /home/model-server/logs
Metrics dir: /home/model-server/logs
Netty threads: 4
Netty client threads: 0
Default workers per model: 1
Blacklist Regex: N/A
Maximum Response Size: 6553500
Maximum Request Size: 6553500
Limit Maximum Image Pixels: true
Prefer direct buffer: false
Allowed Urls: [file://.*|http(s)?://.*]
Custom python dependency for model allowed: true
Metrics report format: prometheus
Enable metrics API: true
Workflow Store: /mnt/models/model-store
Model config: N/A
2022-11-18T13:37:44,208 [INFO ] main org.pytorch.serve.servingsdk.impl.PluginsManager - Loading snapshot serializer plugin...
2022-11-18T13:37:44,288 [INFO ] main org.pytorch.serve.snapshot.SnapshotManager - Started restoring
2022-11-18T13:37:44,297 [INFO ] main org.pytorch.serve.snapshot.SnapshotManager - Validating snapshot startup.cfg
2022-11-18T13:37:44,298 [INFO ] main org.pytorch.serve.snapshot.SnapshotManager - Snapshot startup.cfg validated successfully
[I 221118 13:37:46 __main__:75] Wrapper : Model names ['modelname'], inference address http//0.0.0.0:8085, management address http://0.0.0.0:8085, model store /mnt/models/model-store
[I 221118 13:37:46 TorchserveModel:54] kfmodel Predict URL set to 0.0.0.0:8085
[I 221118 13:37:46 TorchserveModel:56] kfmodel Explain URL set to 0.0.0.0:8085
[I 221118 13:37:46 TSModelRepository:30] TSModelRepo is initialized
[I 221118 13:37:46 model_server:150] Registering model: modelname
[I 221118 13:37:46 model_server:123] Listening on port 8080
[I 221118 13:37:46 model_server:125] Will fork 1 workers
[I 221118 13:37:46 model_server:128] Setting max asyncio worker threads as 12
2022-11-18T13:37:54,738 [DEBUG] main org.pytorch.serve.wlm.ModelVersionedRefs - Adding new version 1.0 for model modelname
2022-11-18T13:37:54,738 [DEBUG] main org.pytorch.serve.wlm.ModelVersionedRefs - Setting default version to 1.0 for model modelname
[I 221118 13:40:12 TorchserveModel:78] PREDICTOR_HOST : 0.0.0.0:8085
[E 221118 13:40:12 web:1789] Uncaught exception POST /v1/models/modelname:predict (127.0.0.1)
HTTPServerRequest(protocol='http', host='localhost:5000', method='POST', uri='/v1/models/modelname:predict', version='HTTP/1.1', remote_ip='127.0.0.1')
Traceback (most recent call last):
File "/usr/local/lib/python3.8/dist-packages/tornado/web.py", line 1704, in _execute
result = await result
File "/usr/local/lib/python3.8/dist-packages/kserve/handlers/predict.py", line 70, in post
response = await model(body)
File "/usr/local/lib/python3.8/dist-packages/kserve/model.py", line 86, in __call__
response = (await self.predict(request)) if inspect.iscoroutinefunction(self.predict) \
File "/home/model-server/kserve_wrapper/TorchserveModel.py", line 80, in predict
response = await self._http_client.fetch(
ConnectionRefusedError: [Errno 111] Connection refused
[E 221118 13:40:12 web:2239] 500 POST /v1/models/modelname:predict (127.0.0.1) 9.66ms
[I 221118 13:40:13 TorchserveModel:78] PREDICTOR_HOST : 0.0.0.0:8085
[E 221118 13:40:13 web:1789] Uncaught exception POST /v1/models/modelname:predict (127.0.0.1)
HTTPServerRequest(protocol='http', host='localhost:5000', method='POST', uri='/v1/models/modelname:predict', version='HTTP/1.1', remote_ip='127.0.0.1')
Traceback (most recent call last):
File "/usr/local/lib/python3.8/dist-packages/tornado/web.py", line 1704, in _execute
result = await result
File "/usr/local/lib/python3.8/dist-packages/kserve/handlers/predict.py", line 70, in post
response = await model(body)
File "/usr/local/lib/python3.8/dist-packages/kserve/model.py", line 86, in __call__
response = (await self.predict(request)) if inspect.iscoroutinefunction(self.predict) \
File "/home/model-server/kserve_wrapper/TorchserveModel.py", line 80, in predict
response = await self._http_client.fetch(
ConnectionRefusedError: [Errno 111] Connection refused
[E 221118 13:40:13 web:2239] 500 POST /v1/models/modelname:predict (127.0.0.1) 3.31ms
[I 221118 13:40:14 TorchserveModel:78] PREDICTOR_HOST : 0.0.0.0:8085
[E 221118 13:40:14 web:1789] Uncaught exception POST /v1/models/modelname:predict (127.0.0.1)
HTTPServerRequest(protocol='http', host='localhost:5000', method='POST', uri='/v1/models/modelname:predict', version='HTTP/1.1', remote_ip='127.0.0.1')
Traceback (most recent call last):
File "/usr/local/lib/python3.8/dist-packages/tornado/web.py", line 1704, in _execute
result = await result
File "/usr/local/lib/python3.8/dist-packages/kserve/handlers/predict.py", line 70, in post
response = await model(body)
File "/usr/local/lib/python3.8/dist-packages/kserve/model.py", line 86, in __call__
response = (await self.predict(request)) if inspect.iscoroutinefunction(self.predict) \
File "/home/model-server/kserve_wrapper/TorchserveModel.py", line 80, in predict
response = await self._http_client.fetch(
ConnectionRefusedError: [Errno 111] Connection refused
[E 221118 13:40:14 web:2239] 500 POST /v1/models/modelname:predict (127.0.0.1) 3.38ms
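From the traceback, the KServe wrapper (listening on 8080) accepts the request and forwards it to TorchServe at 0.0.0.0:8085, where it gets connection refused, i.e. nothing is accepting connections on the inference port at request time. A minimal sketch to confirm this from inside the pod (assuming `kubectl exec` access; the container image ships with Python):

```python
import socket

# Run inside the kserve-container (e.g. via `kubectl exec`) to see which
# ports are actually accepting connections at request time.
# 8085 = TorchServe inference/management, 8082 = metrics, 8080 = KServe wrapper
for port in (8085, 8082, 8080):
    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    s.settimeout(2)
    is_open = s.connect_ex(("127.0.0.1", port)) == 0  # connect_ex returns 0 on success
    print(f"port {port}: {'open' if is_open else 'refused'}")
    s.close()
```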
I was not able to find the tornado package (/usr/local/lib/python3.8/dist-packages/tornado/web.py) inside the mar file, so I don't think it is used directly by the model; judging by the traceback, it is pulled in by the KServe Python wrapper (kserve/handlers/predict.py), not by our handler code.
I tried deploying it on both KServe 0.7 and 0.9: our mar file works on 0.7 but fails on 0.9. I also deployed the sample TorchServe InferenceService (https://kserve.github.io/website/0.9/modelserving/v1beta1/torchserve/#create-the-torchserve-inferenceservice) on KServe 0.9 and it worked as expected.
I reproduced this on GKE, RKE2, and Docker Desktop Kubernetes, with the same result on all three.
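For completeness, this is roughly what our InferenceService spec looks like (the name and storageUri are placeholders; the deploymentMode annotation is what selects RawDeployment):

```yaml
apiVersion: serving.kserve.io/v1beta1
kind: InferenceService
metadata:
  name: modelname                        # placeholder
  annotations:
    serving.kserve.io/deploymentMode: RawDeployment
spec:
  predictor:
    pytorch:
      # placeholder; the bucket holds the mar file under model-store/
      # and the config.properties under config/
      storageUri: gs://our-bucket/model
```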