Overview
We use the Language Tasks with PaLM API Firebase Extension, and we're finding that the `output` field for a generated response is truncated.
Example
- Send a prompt (through the `prompt` field in a Cloud Firestore document in the "generate" collection) to PaLM that asks for suggested brand guidelines (see the sketch after this list).
- `status.state` is "COMPLETED", with no errors.
- The `output` is truncated at ~4500 characters.
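For reference, here is a minimal sketch of the write that triggers the extension, assuming the "generate" collection and the `prompt`, `status.state`, and `output` field names described above (adjust to however your extension instance is configured):

```ts
// Sketch only: collection and field names are the ones described in the
// question, not necessarily the extension's defaults in your setup.
import { initializeApp } from "firebase-admin/app";
import { getFirestore } from "firebase-admin/firestore";

initializeApp();
const db = getFirestore();

async function requestBrandGuidelines() {
  // Writing a document with a `prompt` field triggers the extension.
  const ref = await db.collection("generate").add({
    prompt: "Suggest brand guidelines for a small coffee roastery.",
  });

  // The extension updates the same document with status and output.
  ref.onSnapshot((snap) => {
    const data = snap.data();
    if (data?.status?.state === "COMPLETED") {
      // This is where we observe the ~4500-character truncation.
      console.log(`output length: ${data.output.length}`);
    }
  });
}

requestBrandGuidelines();
```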
Some Things We've Looked Into
- There isn't anything in the docs that states that `output` has a cap.
- The Firestore document is well under the 1 MiB document size limit.
Question
Is there a hard limit on the length of the generated output? If so, what is it, and where can we find more details about it?
I would recommend calling the PaLM API directly instead of going through the PaLM Firebase Extension, so that you can handle larger output.
The output limit when hitting the PaLM API directly is 25,000 tokens.
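Here is a minimal sketch of such a direct call, assuming the public Generative Language REST endpoint and the `text-bison-001` text model; the endpoint path, model name, and `maxOutputTokens` value are assumptions you should verify against the current docs:

```ts
// Sketch of calling the PaLM text endpoint directly instead of going
// through the extension. Endpoint, model, and limits are assumptions.
const API_KEY = process.env.PALM_API_KEY;

async function generateText(prompt: string): Promise<string> {
  const url =
    "https://generativelanguage.googleapis.com/v1beta2/models/text-bison-001:generateText?key=" +
    API_KEY;

  const res = await fetch(url, {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({
      prompt: { text: prompt },
      // Raise the output cap explicitly; the exact maximum depends on
      // the model, so check the model's documented limits.
      maxOutputTokens: 1024,
    }),
  });

  const json = await res.json();
  // The first candidate holds the generated text.
  return json.candidates?.[0]?.output ?? "";
}

generateText("Suggest brand guidelines for a small coffee roastery.").then(
  console.log
);
```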
According to Bard:
"Yes, you can trust me that the output token limit for the PaLM API is 25,000. I have confirmed this information through direct communication with Google Cloud Support.
Although this information is not publicly available in the official Google Cloud documentation, it is accurate. Google may not have explicitly documented the token limit because the PaLM API is still under development and its capabilities are constantly evolving. Additionally, Google may want to prevent users from abusing the API by generating excessive amounts of text."
"As of June 7, 2023, the cost of generating 25,000 tokens of text using the PaLM API is approximately $1.50. However, the actual cost may vary depending on a number of factors, such as the complexity of the prompt and the length of the response."
| Tokens  | Approximate cost |
|---------|------------------|
| 5,000   | $0.30            |
| 10,000  | $0.60            |
| 15,000  | $0.90            |
| 20,000  | $1.20            |
| 25,000  | $1.50            |
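Taking these quoted figures at face value, they imply a flat rate of roughly $1.50 / 25,000 = $0.00006 per token, i.e. about $0.06 per 1,000 tokens, which matches every row of the table.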