Why doesn't the Vertex API let me get multiple responses?


I have a TypeScript function that gets a response from the Vertex API. I'm using the new gemini-pro model to send a message from the client to the API. However, the for await loop only ever receives one response chunk, and that response is incomplete. In the example below, I use Prisma to save the chat history.

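import { VertexAI } from '@google-cloud/vertexai'

// prisma (a PrismaClient instance) and the IChat type are defined elsewhere in the project.
const projectId = process.env.PROJECT_ID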
const location = 'us-central1'

const publisher = 'google'
const model = 'gemini-pro'
const vertexAI = new VertexAI({ project: projectId as string, location: location })
const generativeModel = vertexAI.preview.getGenerativeModel({
    model: model,
    generation_config: {
        max_output_tokens: 2048,
        temperature: 0.9,
        top_p: 1,
    },
})

const chat = generativeModel.startChat({})

export async function callPredict2(chatId: string, message: string) {
    let history = (await prisma.chat.findUnique({
        where: {
            id: chatId,
        },
    })) as IChat

    if (history === null) {
        history = await prisma.chat.create({
            data: {
                id: chatId,
            },
        })
        await prisma.$disconnect()
    }

    try {
        const result1 = await chat.sendMessageStream(message)
        const finalResponse = []

        for await (const item of result1.stream) {
            finalResponse.push(item.candidates[0].content.parts[0].text)
        }

        // Streamed chunks are consecutive fragments of one reply, so join them without a separator.
        const text = finalResponse.join('')

        await prisma.message.create({
            data: {
                chatId: chatId,
                role: 'bot',
                text: text,
            },
        })

        return text

    } catch (error) {
        console.error(error)
    }
}

For example:

prompt: i need to get the richest countries in the world.

response: 1. **Luxembourg:** With a GDP per capita of $120,

The length of finalResponse is always 1.
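
One way to narrow this down, offered as a diagnostic sketch rather than a confirmed fix: record how many chunks the stream yields and which finish reason the last candidate reports, instead of reading only parts[0].text. The helper name collectStreamText and the small interfaces below are hypothetical; they only mirror the fields the code above already reads, plus a finishReason field that I am assuming each candidate exposes (values such as STOP, MAX_TOKENS or SAFETY).

```typescript
// Minimal structural types for the parts of a streamed chunk this sketch reads.
// They mirror the fields used in the question; they are not the SDK's own types.
interface ChunkPart { text?: string }
interface ChunkCandidate {
    content?: { parts?: ChunkPart[] }
    finishReason?: string // assumption: set on the chunk where generation stops
}
interface StreamChunk { candidates?: ChunkCandidate[] }

interface CollectedStream {
    text: string
    chunkCount: number
    finishReason?: string
}

// Hypothetical helper: drains a chunk stream and records why generation stopped.
async function collectStreamText(
    stream: AsyncIterable<StreamChunk>,
): Promise<CollectedStream> {
    const pieces: string[] = []
    let chunkCount = 0
    let finishReason: string | undefined

    for await (const chunk of stream) {
        chunkCount += 1
        const candidate = chunk.candidates?.[0]
        if (!candidate) continue
        // A chunk may carry several parts, so read all of them, not only parts[0].
        for (const part of candidate.content?.parts ?? []) {
            if (typeof part.text === 'string') pieces.push(part.text)
        }
        // Keep the last finish reason reported by the stream.
        if (candidate.finishReason) finishReason = candidate.finishReason
    }

    // Chunks are consecutive fragments of one answer, so join without a separator.
    return { text: pieces.join(''), chunkCount, finishReason }
}
```

In callPredict2 this would replace the for await loop, for example const { text, chunkCount, finishReason } = await collectStreamText(result1.stream), assuming the SDK's chunk type is structurally compatible with StreamChunk. Logging chunkCount and finishReason should show whether the stream really ends after one chunk and, if so, whether the model reports a reason other than STOP for ending there.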
