Every night we train a new model and deploy it (in Flask) to an existing endpoint. The Flask code is (simplified) as follows:
from flask import Flask, request
from werkzeug.exceptions import BadRequest


def load_models():
    """
    Load the new model from Google Cloud Storage.
    """
    ...
    return model


app = Flask(__name__)
app.json.ensure_ascii = False

# The model is loaded at import time, before the first request is served
MODELS_ARE_LOADED = False
model = load_models()
MODELS_ARE_LOADED = True


@app.route('/predict', methods=["POST"])
def predict():
    data = request.get_data()
    predictions = model.predict(data)
    return {"predictions": predictions}


@app.route('/health', methods=["GET"])
def health():
    if not MODELS_ARE_LOADED:
        raise BadRequest("Models are not loaded yet")
    return "OK"
This works fine. The issue is that the old model seems to be removed before the new model is ready, leaving the endpoint unavailable for a few minutes.
The MODELS_ARE_LOADED check seems to work locally, i.e. it returns the error message while the models are not loaded, and as far as I understand the endpoint is not considered "ready" until it is healthy.
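To illustrate what I mean by checking it locally, here is a minimal sketch (the base URL, port, and payload are just placeholders, not my actual setup) that polls /health until the models are loaded and only then calls /predict:

import time
import requests

BASE_URL = "http://localhost:8000"  # placeholder local address

# Poll /health until it returns 200, i.e. until MODELS_ARE_LOADED is True
for _ in range(30):
    try:
        if requests.get(f"{BASE_URL}/health", timeout=1).status_code == 200:
            break
    except requests.exceptions.ConnectionError:
        pass  # server not up yet
    time.sleep(1)

# Once healthy, the predict endpoint responds as expected
resp = requests.post(f"{BASE_URL}/predict", data=b"some input")
print(resp.json())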
I would have assumed that, when deploying a new model, traffic wouldn't be routed to it before it is healthy. Or am I wrong here?