Azure OpenAI on your data - System message usage


I'm working with Azure OpenAI on your data and trying to understand:

  • why the system message does not seem to work as expected
  • what strategy to use for providing instructions in the “on your data” case

For example, calling POST https://{service-name}.openai.azure.com/openai/deployments/{model-name}/extensions/chat/completions?api-version=2023-08-01-preview

{
    "messages": [
        {
            "role": "system",
            "content": "Your task is to always respond in French."
        },
        {
            "role": "user",
            "content": "How to cherry pick a PR?"
        }
    ],
    "temperature": 0.5,
    "max_tokens": 12000,
    "top_p": 1,
    "dataSources": [
        {
            "type": "AzureCognitiveSearch",
            "parameters": {
                ...
                "queryType": "semantic",
                "inScope": true,
                "roleInformation": "Your task is to always respond in French."
            }
        }
    ]
}

I get the following response:

{
    "id": "GUID",
    "model": "gpt-4-32k",
    "created": timestamp,
    "object": "extensions.chat.completion",
    "choices": [
        {
            "index": 0,
            "finish_reason": "stop",
            "message": {
                "role": "assistant",
                "content": "The most common way to cherry pick your PR ...",
                "end_turn": true,
                "context": {
                    "messages": [
                        {
                            "role": "tool",
                            "content": "{\"citations\": [{\"content\": \"{citation_content}\", \"intent\": \"How to cherry pick a PR?\"}",
                            "end_turn": false
                        }
                    ]
                }
            }
        }
    ]
}

As we can see, the system message had no effect: the answer is still in English. Is this a known issue?

Here is an example that puts the instruction in the user’s prompt instead. The main problem is that the search query sent to Azure Search then contains parts of that prompt.

POST https://{service-name}.openai.azure.com/openai/deployments/{model-name}/extensions/chat/completions?api-version=2023-08-01-preview

{
    "messages": [
        {
            "role": "user",
            "content": "How to cherry pick a PR? When responding to this query, please translate the message to French."
        }
    ],
    "temperature": 0.5,
    "max_tokens": 12000,
    "top_p": 1,
    "dataSources": [
        {
            "type": "AzureCognitiveSearch",
            "parameters": {
                ...
                "queryType": "semantic",
                "inScope": true,
                "roleInformation": "Your task is to always respond in French."
            }
        }
    ]
}

I get the following response:

{
    "id": "GUID",
    "model": "gpt-4-32k",
    "created": timestamp,
    "object": "extensions.chat.completion",
    "choices": [
        {
            "index": 0,
            "finish_reason": "stop",
            "message": {
                "role": "assistant",
                "content": "Voici comment vous pouvez choisir un PR ...",
                "end_turn": true,
                "context": {
                    "messages": [
                        {
                            "role": "tool",
                            "content": "{\"citations\": [{\"content\": \"This is the most common way ....\", \"intent\": \"How to cherry pick a PR? When responding to this query, please translate the message to French.\"}",
                            "end_turn": false
                        }
                    ]
                }
            }
        }
    ]
}

Putting the instruction directly in the user’s prompt worked, but the search query now contains the instruction as well (see the intent field, which is the text used to query the search service).

Two main questions:

  • How can I use the system message? I’ve tried different approaches (such as setting roleInformation in dataSources, or moving the system message after the user prompt), but none seems to work. Is this a known issue?
  • How can I provide instructions to the model? For example, I may want a summary, or a response in a specific style on each turn. How can I give that kind of instruction if it ends up in the search query? I’m sure there’s a better strategy here.

There are 2 answers

Ram

Yes, this is a known bug. We have created a backlog item for it, and it is a great fine-tuning issue. If you are using Turbo, note that it does not prioritize system message instructions quite as highly as the 0613 model version does.

You can try a different system message with the settings given below.

gpt-35-turbo 0613, temperature = 0.4, top_p = 0.4, system message: "Please respond your answer in French."
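
As a rough sketch, the suggested settings could be applied to the request body from the question as shown here. The helper name build_request_body is hypothetical, the search connection parameters are elided in the question and are left for the caller to supply, and the model version (gpt-35-turbo 0613) is chosen through the deployment in the URL rather than in the body.

```python
import json
from typing import Any, Dict


def build_request_body(
    system_message: str, question: str, search_parameters: Dict[str, Any]
) -> Dict[str, Any]:
    """Build an "on your data" request body using the settings suggested above.

    build_request_body is a hypothetical helper; the field names mirror the
    request shown in the question.
    """
    return {
        "messages": [
            {"role": "system", "content": system_message},
            {"role": "user", "content": question},
        ],
        "temperature": 0.4,
        "top_p": 0.4,
        "dataSources": [
            {"type": "AzureCognitiveSearch", "parameters": search_parameters}
        ],
    }


body = build_request_body(
    "Please respond your answer in French.",
    "How to cherry pick a PR?",
    # The connection settings are elided in the question, so they are omitted here.
    {"queryType": "semantic", "inScope": True},
)
print(json.dumps(body, indent=4))
```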

Nicolas R

As stated by Ram, this is a known issue when using this "all-in-one" endpoint.

I have also run into it with almost all models, Turbo or not. My advice would be to avoid this "all-in-one" approach and split it into 2 parts:

  • document retrieval first
  • then answer generation, including the output of this 1st step and your specific system message

With that, you will avoid this kind of behavior and have better control over how your retrieved data is used to generate the answer.
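
A minimal sketch of this two-step split might look like the following. It is not tied to any SDK: retrieve and complete are hypothetical callables that you would back with your own search query and your own chat completion call, and the prompt wording and [docN] labels are assumptions made for illustration only.

```python
from typing import Callable, Dict, List

Message = Dict[str, str]


def build_grounded_messages(
    system_message: str, question: str, documents: List[str]
) -> List[Message]:
    """Build the chat messages for the answer generation step."""
    # Number the retrieved passages so the model can refer to them as [doc1], [doc2], ...
    sources = "\n\n".join(
        f"[doc{i}] {text}" for i, text in enumerate(documents, start=1)
    )
    # Assumed prompt layout: your instructions first, then the retrieved sources.
    system_content = (
        f"{system_message}\n\n"
        "Answer using only the sources below.\n\n"
        f"{sources}"
    )
    return [
        {"role": "system", "content": system_content},
        {"role": "user", "content": question},
    ]


def answer_question(
    question: str,
    system_message: str,
    retrieve: Callable[[str], List[str]],
    complete: Callable[[List[Message]], str],
) -> str:
    """Run retrieval first, then generation with your own system message."""
    # Step 1: search with the raw question only, so no instructions leak into the query.
    documents = retrieve(question)
    # Step 2: generate the answer from the retrieved documents and the system message.
    messages = build_grounded_messages(system_message, question, documents)
    return complete(messages)
```

Because the search step only ever sees the user's question, instructions such as "respond in French" or "answer with a summary" stay out of the search query and reach the model through the system message alone.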

Note: splitting it that way also avoids the ugly practice of passing the Cognitive Search service key in the payload of the "all-in-one" call, which in my view is bad practice from a security standpoint!