I am running a FastAPI + Uvicorn server on an EC2 instance: 4 Uvicorn workers and 4 Celery workers on an 8 vCPU machine. The task is very simple: I receive data on a POST endpoint, schedule a background task via Celery, and return a response immediately. Even for this simple flow, sending 40 concurrent requests per second gives me a p99 of 600 ms, which is very high; it should be in the double-digit milliseconds. What could be going wrong? FastAPI's built-in BackgroundTasks won't help here because the task is CPU-bound and takes time to execute; I have already tried it (see the sketch at the end of this post).
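For context, this is roughly how everything is started and load-tested (the module name main and the sample payload are placeholders, and I use something like hey as the load generator; the worker counts match the setup above):

uvicorn main:app --host 0.0.0.0 --port 8000 --workers 4
celery -A main.celery_app worker --concurrency=4 --loglevel=info

hey -z 30s -c 40 -m POST -T application/json \
    -d '{"userId": "u1", "totalTransactions": 5}' \
    http://localhost:8000/predict/

The application code: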
import logging

from celery import Celery
from fastapi import FastAPI

logger = logging.getLogger(__name__)

celery_app = Celery("tasks", broker="redis://localhost:6379/0", backend="redis://localhost:6379/0")
app = FastAPI()

def default_response(user_id):
    # Placeholder for the dummy payload returned to the client immediately.
    return {"userId": user_id, "score": None}

@celery_app.task
def generate_model_score(request_data):
    # CPU-bound ML model scoring; runs in a Celery worker process.
    pass

@app.post("/predict/")
async def sample_fn(request_data: dict):
    # Simplified stand-in for however total_transactions is actually derived.
    total_transactions = request_data.get("totalTransactions", 0)
    if total_transactions == 0:
        logger.info("user doesn't have any data")
        return default_response(request_data["userId"])
    generate_model_score.delay(request_data)
    logger.debug("end of predict")
    # Respond with dummy data immediately
    return default_response(request_data["userId"])
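For reference, here is roughly what the BackgroundTasks version looked like before I moved to Celery (a minimal sketch reusing app, generate_model_score, and default_response from the code above; at that point generate_model_score was a plain function rather than a Celery task, and the route name is a placeholder):

from fastapi import BackgroundTasks

@app.post("/predict-bg/")
async def sample_fn_bg(request_data: dict, background_tasks: BackgroundTasks):
    # add_task runs the function in the same process after the response
    # is sent, so a CPU-bound job still competes with the request
    # handlers for the same CPUs.
    background_tasks.add_task(generate_model_score, request_data)
    return default_response(request_data["userId"])

Because this keeps the scoring work inside the Uvicorn worker processes, it didn't help with latency, which is why I switched to Celery.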