Encountering 503 Error When Calling Gemini API from Google Colab


I'm working on a project using Google Colab to run Python code that interacts with the Gemini API (a part of Google's Cloud AI tools). The goal is to automate call transcript categorization into predefined categories using Gemini's AI.

Here's a brief overview of what I'm doing: I read an Excel file of call transcripts, send these transcripts to Gemini for categorization, and then update the Excel file based on the categories identified by the AI (marking them with 0s and 1s).

Below is a snippet of my code for setting up the API and sending a request to Gemini:

import google.generativeai as genai

GOOGLE_API_KEY = "your_api_key_here"
genai.configure(api_key=GOOGLE_API_KEY)
model = genai.GenerativeModel('gemini-pro')

def send_to_gemini(transcript):
    prompt = f"Categorize the following transcript: {transcript}"
    try:
        response = model.generate_content(prompt)
        return response.text
    except Exception as e:
        print(f"Failed to send request to Gemini: {e}")
        return None  # made explicit so callers know to check for a failed request

However, I keep getting an `ERROR:tornado.access:503`, which suggests a server-side issue:

ERROR:tornado.access:503 POST /v1beta/models/gemini-pro:generateContent (127.0.0.1) 4039.47ms

Any advice or insights would be greatly appreciated.

1 Answer

Answered by Prisoner (score: 2)

Error 503 corresponds to "service unavailable".

If you read the full error message on the response, you might get "The model is overloaded. Please try again later."

This is definitely happening on Google's end, not yours, so there is very little you can do to prevent it. However, you should certainly account for it in your code. Retrying in a loop with progressive (exponential) back-off is the standard approach to service-unavailable responses.
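As a minimal sketch of that retry pattern, here is a generic wrapper you could put around `model.generate_content`. The helper name, retry counts, and delays are all illustrative choices, not part of the `google.generativeai` API:

```python
import random
import time

def retry_with_backoff(func, max_retries=5, base_delay=1.0, max_delay=30.0):
    """Call func(), retrying on exception with exponential back-off plus jitter.

    Re-raises the last exception if all attempts fail.
    """
    for attempt in range(max_retries):
        try:
            return func()
        except Exception as e:
            if attempt == max_retries - 1:
                raise  # out of retries; let the caller handle it
            # double the delay each attempt, capped at max_delay
            delay = min(base_delay * (2 ** attempt), max_delay)
            delay += random.uniform(0, delay * 0.1)  # jitter to avoid thundering herd
            print(f"Attempt {attempt + 1} failed ({e}); retrying in {delay:.1f}s")
            time.sleep(delay)

# Usage with the question's model (hypothetical call, not run here):
# response = retry_with_backoff(lambda: model.generate_content(prompt))
```

In production you would likely catch only the specific transient exception types the client library raises for 503s, rather than bare `Exception`, so that genuine bugs still surface immediately.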

You indicate you're using the "google.generativeai" package, which corresponds to the AI Studio offering (aistudio.google.com), not Google Cloud. This is currently a free-tier-only service that is not Generally Available. So it seems likely they are still tuning things to meet expected user demand, and scaling out as they prepare for production and for the introduction of an additional paid tier.