I need a little advice on deploying Triton Inference Server with explicit model control. From the looks of it, this mode gives the user the most control over which models go live. But the problem I'm not able to solve is how to load models when the server goes down in production and a new instance spins up.
The only solution I can think of is to have a service poll the server at regular intervals, constantly checking whether my live models are actually loaded and, if not, loading them. But this seems like quite a complicated process.
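Roughly, this is the kind of watchdog I have in mind (a minimal sketch; the endpoint URL, model names, and poll interval are placeholders, while the `/v2/health/ready`, `/v2/models/<name>/ready`, and `/v2/repository/models/<name>/load` routes are Triton's standard HTTP API):

```python
import time
import requests

TRITON_URL = "http://localhost:8000"      # assumed Triton HTTP endpoint
LIVE_MODELS = ["model_a", "model_b"]      # hypothetical model names
POLL_INTERVAL_S = 30                      # arbitrary polling period

def model_is_ready(name: str) -> bool:
    # Triton returns 200 only when the model is loaded and ready.
    r = requests.get(f"{TRITON_URL}/v2/models/{name}/ready")
    return r.status_code == 200

def load_model(name: str) -> None:
    # Model-repository extension; only works in explicit model-control mode.
    r = requests.post(f"{TRITON_URL}/v2/repository/models/{name}/load")
    r.raise_for_status()

while True:
    try:
        # Skip the cycle entirely if the server itself isn't ready yet.
        if requests.get(f"{TRITON_URL}/v2/health/ready").status_code == 200:
            for model in LIVE_MODELS:
                if not model_is_ready(model):
                    load_model(model)
    except requests.ConnectionError:
        pass  # new instance still starting up; retry on the next cycle
    time.sleep(POLL_INTERVAL_S)
```

It works, but it means running and monitoring an extra sidecar service just to keep the server's state correct.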
I would like to know how others have solved this problem.
Thanks in advance