How can I get LLM to only respond in JSON strings?


This is how I am defining the executor

const executor = await initializeAgentExecutorWithOptions(tools, model, {
  agentType: 'chat-conversational-react-description',
  verbose: false,
});

Whenever I prompt the AI I have this statement at the end.

type SomeObject = {
  field1: number,
  field2: number,
}

- It is critical that you answer only as the above object, JSON-stringified as a single string.
  Don't include any other verbose explanations and don't include Markdown syntax anywhere.

SomeObject is just an example; usually it will be a proper object type. When I use the executor to get a response from the AI, half the time I get the proper JSON string, but the other half of the time the AI completely ignores my instructions and gives me a long, verbose answer in plain English.

How can I make sure I always get the structured data answer I want? Maybe using the agentType: 'chat-conversational-react-description' isn't the right approach here?


There are 4 answers

1
jeff On BEST ANSWER

Update Nov. 6, 2023

OpenAI announced today a new “JSON Mode” at the DevDay Keynote. When activated the model will only generate responses using the JSON format.

You can refer to the official docs here.
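
For a rough idea of what turning JSON Mode on looks like over the REST API, here is a small sketch. The helper name buildJsonModeRequest, the model name, and the prompt wording are my own assumptions for illustration. Keep in mind that JSON Mode only ensures the reply is syntactically valid JSON, not that it matches your object type, and that the API expects the word "JSON" to appear somewhere in the messages.

```typescript
// Hypothetical helper: builds a Chat Completions request with JSON Mode enabled.
interface JsonModeRequest {
  url: string;
  init: { method: string; headers: Record<string, string>; body: string };
}

function buildJsonModeRequest(apiKey: string, question: string): JsonModeRequest {
  const body = {
    model: "gpt-3.5-turbo-1106", // assumption: any model that supports JSON Mode
    response_format: { type: "json_object" }, // this is what switches JSON Mode on
    messages: [
      {
        role: "system",
        // The word "JSON" has to appear in the conversation when JSON Mode is used.
        content: 'Reply only with JSON shaped like {"field1": number, "field2": number}.',
      },
      { role: "user", content: question },
    ],
  };
  return {
    url: "https://api.openai.com/v1/chat/completions",
    init: {
      method: "POST",
      headers: {
        "Content-Type": "application/json",
        Authorization: `Bearer ${apiKey}`,
      },
      body: JSON.stringify(body),
    },
  };
}

// Usage (not executed here, needs a real API key):
//   const { url, init } = buildJsonModeRequest(myApiKey, "Give me two numbers");
//   const data = await (await fetch(url, init)).json();
//   const parsed = JSON.parse(data.choices[0].message.content);
```

You would still want to validate the parsed object against your own type afterwards.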

Original Answer

That's a great question and LangChain provides an easy solution. Look at LangChain's Output Parsers if you want a quick answer. It is the recommended way to process LLM output into a specified format.

Here's the official link from the docs:


Side note: I wrote an introductory tutorial about this particular issue but for Python, so if anyone else is interested in more details you can check it out here.

The example below does not use initializeAgentExecutorWithOptions, but will ensure that the output is processed as JSON without specifying this explicitly in your system prompt.

How it works

In order to tell LangChain that we'll need to convert the LLM response to a JSON output, we'll need to define a StructuredOutputParser and pass it to our chain.

Defining our parser:

Here's an example:

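// Imports assume the `zod` and `langchain` npm packages are installed
import { z } from "zod";
import { StructuredOutputParser } from "langchain/output_parsers";

// Let's define our parser
const parser = StructuredOutputParser.fromZodSchema(
  z.object({
    field1: z.string().describe("first field"),
    field2: z.string().describe("second field")
  })
);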

Adding it to our Chain:


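// We can then add it to our chain
const chain = RunnableSequence.from([
  // The template needs {question} and {format_instructions} placeholders,
  // matching the keys passed to invoke() below
  PromptTemplate.fromTemplate(...),
  new OpenAI({ temperature: 0 }),
  parser, // <-- this line
]);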

Invoking our chain with format_instructions:

// Finally, we'll pass the format instructions to the invoke method
const response = await chain.invoke({
  question: "What is the capital of France?",
  format_instructions: parser.getFormatInstructions(), // <-- this line
});

Go ahead and log the return value of parser.getFormatInstructions() before you call invoke if you'd like to see the instructions that get added to the prompt.

When we pass parser.getFormatInstructions() as the format_instructions property, LangChain inserts instructions describing the desired JSON schema (the one we defined in the parser above) into our prompt before sending it to the large language model.
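
To make that substitution concrete, here is a small sketch with no LangChain dependency. The fillTemplate helper, the template text, and the shortened instruction string are all hypothetical stand-ins; the real text returned by parser.getFormatInstructions() is longer and may differ between versions.

```typescript
// Hypothetical stand-in for what a prompt template does with its input values:
// every {placeholder} is replaced by the value with the matching key.
function fillTemplate(template: string, values: Record<string, string>): string {
  return template.replace(
    /\{(\w+)\}/g,
    // Unknown placeholders are left untouched.
    (whole: string, key: string) => values[key] ?? whole
  );
}

// Assumed template; it contains both placeholders used in chain.invoke().
const promptTemplate =
  "Answer the question as well as you can.\n{format_instructions}\n{question}";

// Shortened, made-up instruction text standing in for the parser's output.
const formatInstructions =
  'Respond with a JSON object with the keys "field1" and "field2".';

const finalPrompt = fillTemplate(promptTemplate, {
  question: "What is the capital of France?",
  format_instructions: formatInstructions,
});

console.log(finalPrompt);
```

The point is simply that the model sees the formatting instructions as part of the prompt text; the parser then tries to read the reply back into an object.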

As a final point, it is absolutely critical to make sure your query/prompt is relevant and produces values that could be interpreted as the properties in your object SomeObject that are defined in the parser.

Please give this a try, and let me know if you're able to consistently output JSON.

0
luona.dev On

Generally speaking: Due to the nature of LLMs, you can never guarantee a JSON response. You will have to adapt your strategy to cope with this fact. Off the top of my head, these are your options:

Prompt engineering --> gets you to 95%

With careful prompting and specific instructions you can maximize the likelihood of getting a JSON response. There are a lot of resources on prompt engineering out there, but since it is model dependent and subject to change, you will always have to experiment to find what works best for your case.

Post-Processing --> gets you to 99%

Make your application code more resilient to responses that are not pure JSON. For example, you could use a regular expression to extract potential JSON strings from a response. Here is a very naive approach that simply extracts everything between the first { and the last }:

const naiveJSONFromText = (text) => {
    const match = text.match(/\{[\s\S]*\}/);
    if (!match) return null;

    try {
        return JSON.parse(match[0]);
    } catch {
        return null;
    }
};

Validation Loop --> gets you to 100%

In the end, you will always have to implement validation logic to check (A) that you are dealing with a valid JSON object and (B) that it has your expected format.

const isValidSomeObject = (obj) =>
    typeof obj?.field1 === 'number' && typeof obj?.field2 === 'number';

Depending on your use case, I would recommend automatically querying the LLM again if this validation fails.
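
A minimal sketch of such a retry loop, assuming the model call is wrapped in a function you pass in. The names askUntilValid and isSomeObject and the default of three attempts are illustrative choices of mine, not part of any library; the type guard mirrors the validator above.

```typescript
type SomeObject = { field1: number; field2: number };

// Type guard equivalent to the validator above.
function isSomeObject(value: unknown): value is SomeObject {
  const obj = value as Partial<SomeObject> | null;
  return typeof obj?.field1 === "number" && typeof obj?.field2 === "number";
}

// Calls the model until the reply parses as JSON and passes validation,
// or gives up after maxAttempts tries.
async function askUntilValid(
  callModel: (prompt: string) => Promise<string>, // wrap your executor/chain call here
  prompt: string,
  maxAttempts = 3
): Promise<SomeObject> {
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    const reply = await callModel(prompt);
    try {
      const parsed: unknown = JSON.parse(reply);
      if (isSomeObject(parsed)) return parsed;
    } catch {
      // Not valid JSON at all: fall through and try again.
    }
  }
  throw new Error(`No valid response after ${maxAttempts} attempts`);
}
```

Capping the number of attempts matters: without a limit, a prompt the model consistently misunderstands would loop (and bill you) forever.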

Closing thought: Even though prompt engineering gets you the furthest, I would recommend implementing the other parts first so you can get going, and then reducing the number of validation failures by improving your prompt.

0
Noam On

The solution depends on whether you are using closed or open models.

  • For OpenAI, use JSON Mode as mentioned above
  • For other API servers, usually accessed via a REST API, you can converse with the API via chat and tell it what the problems are until it gets it right. An example library that does this is TypeChat.
  • If you are running your own LLM, you can use decoder libraries such as lm-format-enforcer (which has LangChain integration), jsonformer, guidance, and outlines.
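
To illustrate the "tell it what the problems are" idea from the second bullet, here is a rough sketch of building the follow-up messages. This is not TypeChat's API; the ChatMessage type and the helpers withRepairRequest and describeProblem are hypothetical names, and the wording of the repair message is just one possibility.

```typescript
type ChatMessage = { role: "system" | "user" | "assistant"; content: string };

// Hypothetical helper: extends the conversation with the model's bad reply and
// a message describing what was wrong, so the next call can correct it.
function withRepairRequest(
  messages: ChatMessage[],
  badReply: string,
  problem: string
): ChatMessage[] {
  return [
    ...messages,
    { role: "assistant", content: badReply },
    {
      role: "user",
      content:
        `Your previous reply was rejected: ${problem}\n` +
        "Reply again with only the corrected JSON object.",
    },
  ];
}

// Describes why a reply is unusable, or returns null when it parses as JSON.
function describeProblem(reply: string): string | null {
  try {
    JSON.parse(reply);
    return null;
  } catch (err) {
    return `not valid JSON (${(err as Error).message})`;
  }
}
```

You would send the extended message list back to the chat endpoint and repeat until describeProblem (or a stricter schema check) returns null.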
0
Farooq Zaman On

Generating structured JSON from language models is a challenging task. The generated JSON must be syntactically correct, and it must conform to a schema that specifies the structure of the JSON.

Current approaches to this problem are brittle and error-prone. They rely on prompt engineering, fine-tuning, and post-processing, but they still fail to generate syntactically correct JSON in many cases.

Jsonformer is a new approach to this problem. In structured data, many tokens are fixed and predictable. Jsonformer is a wrapper around Hugging Face models that fills in the fixed tokens during the generation process, and only delegates the generation of content tokens to the language model. This makes it more efficient and bulletproof than existing approaches.
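
The core idea can be sketched in a few lines of TypeScript. This only illustrates the principle and is not Jsonformer's actual implementation: the Schema type, fillFromSchema, and the generateValue callback (standing in for the language model) are hypothetical, and arrays are left out for brevity.

```typescript
type LeafKind = "string" | "number" | "boolean";
type Schema =
  | { type: LeafKind }
  | { type: "object"; properties: Record<string, Schema> };

type Leaf = string | number | boolean;
type Filled = Leaf | { [key: string]: Filled };

// The structure (keys and nesting) comes from the schema and is built by
// ordinary code; only leaf values are requested from the "model" callback.
function fillFromSchema(
  schema: Schema,
  generateValue: (path: string, kind: LeafKind) => Leaf,
  path = ""
): Filled {
  if (schema.type === "object") {
    const result: { [key: string]: Filled } = {};
    for (const [key, child] of Object.entries(schema.properties)) {
      result[key] = fillFromSchema(child, generateValue, path ? `${path}.${key}` : key);
    }
    return result;
  }
  return generateValue(path, schema.type);
}

// Usage with a fake "model" that returns canned values per type.
const personSchema: Schema = {
  type: "object",
  properties: {
    fullName: { type: "string" },
    age: { type: "number" },
    is_student: { type: "boolean" },
  },
};

const fakeModel = (_path: string, kind: LeafKind): Leaf =>
  kind === "string" ? "Ada" : kind === "number" ? 36 : false;

const person = fillFromSchema(personSchema, fakeModel);
console.log(JSON.stringify(person));
// → {"fullName":"Ada","age":36,"is_student":false}
```

Because the model never gets to emit braces, quotes, or key names, it cannot produce structurally broken output; the worst it can do is supply a poor value.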

Jsonformer currently supports a subset of JSON Schema. Below is a list of the supported schema types:

  • number
  • boolean
  • string
  • array
  • object

install:

pip install jsonformer

Example:

from jsonformer import Jsonformer
from transformers import AutoModelForCausalLM, AutoTokenizer
model = AutoModelForCausalLM.from_pretrained("databricks/dolly-v2-3b")
tokenizer = AutoTokenizer.from_pretrained("databricks/dolly-v2-3b")
json_schema = {
    "type": "object",
    "properties": {
        "name": {"type": "string"},
        "age": {"type": "number"},
        "is_student": {"type": "boolean"},
        "courses": {
            "type": "array",
            "items": {"type": "string"}
        }
    }
}

prompt = "Generate a person's information based on the following schema:"
jsonformer = Jsonformer(model, tokenizer, json_schema, prompt)
generated_data = jsonformer()
print(generated_data)

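Here is an example of generated output. Note that it comes from a different, more deeply nested schema (a car and its owner) than the person schema above, and that it is printed with unquoted keys and Python-style True values, so it is not strict JSON: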

{
  car: {
    make: "audi",
    model: "model A8",
    year: 2016.0,
    colors: [
      "blue"
    ],
    features: {
      audio: {
        brand: "sony",
        speakers: 2.0,
        hasBluetooth: True
      },
      safety: {
        airbags: 2.0,
        parkingSensors: True,
        laneAssist: True
      },
      performance: {
        engine: "4.0",
        horsepower: 220.0,
        topSpeed: 220.0
      }
    }
  },
  owner: {
    firstName: "John",
    lastName: "Doe",
    age: 40.0
  }
}