OpenAI actions for custom GPT: How to modify OpenAPI schema to send a file along with string

I am making a custom GPT that connects to my own server. It works when the action sends only a string, but when I try to let the user also send a file through the ChatGPT interface (I only need image files to work), the image is not sent, only the string. How do I modify the schema below to send the image as well?

{
  "openapi": "3.1.0",
  "info": {
    "title": "Send an image and a string",
    "description": "Makes it super easy to send an image and a string",
    "version": "v1.0.0"
  },
  "servers": [
    {
      "url": "https://myawesomeserver.loca.lt"
    }
  ],
  "paths": {
    "/api/gpt/create": {
      "post": {
        "description": "Create a string and image",
        "operationId": "CreateImageandString",
        "parameters": [
          {
            "name": "an_awesome_string",
            "in": "query",
            "description": "The value of the string we will create",
            "required": true,
            "schema": {
              "type": "string"
            }
          }
        ],
        "requestBody": {
          "description": "image to be uploaded",
          "required": true,
          "content": {
            "multipart/form-data": {
              "schema": {
                "type": "object",
                "properties": {
                  "image": {
                    "type": "string",
                    "format": "binary"
                  }
                }
              }
            }
          }
        },
        "deprecated": false
      }
    }
  },
  "components": {
    "schemas": {}
  }
}

There are 3 answers

8
Jeremy Fiel On

You were pretty close, but you missed a few things with the encoding object. It is optional, but it describes the payload much more precisely. The other change is that the string should be sent in the request body (here as a JSON-typed part of the multipart form) rather than as a query parameter. Query parameters should be reserved for search terms.

{
  "openapi": "3.1.0",
  "info": {
    "title": "Send an image and a string",
    "description": "Makes it super easy to send an image and a string",
    "version": "1.0.0"
  },
  "servers": [
    {
      "url": "https://myawesomeserver.loca.lt"
    }
  ],
  "paths": {
    "/api/gpt/create": {
      "post": {
        "description": "Create a string and image",
        "operationId": "CreateImageandString",
        "parameters": [],
        "requestBody": {
          "description": "image to be uploaded",
          "required": true,
          "content": {
            "multipart/form-data": {
              "schema": {
                "type": "object",
                "properties": {
                  "an_awesome_string": {
                    "type": "string"
                  },
                  "image": {
                    "type": "string",
                    "format": "binary"
                  }
                }
              },
              "encoding": {
                "an_awesome_string": {
                  "headers": {
                    "content-disposition": {
                      "$ref": "#/components/headers/content-disposition"
                    }
                  },
                  "contentType": "application/json"
                },
                "image": {
                  "headers": {
                    "content-disposition": {
                      "$ref": "#/components/headers/content-disposition"
                    }
                  },
                  "contentType": "image/*"
                }
              }
            }
          }
        }
      }
    }
  },
  "components": {
    "headers": {
      "content-disposition": {
        "description": "the content-disposition header",
        "schema": {
          "type": "string"
        },
        "required": true
      }
    }
  }
}
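
One easy mistake with the encoding object is a key that does not match any property in the schema. The OpenAPI specification requires each encoding key to name a schema property, so a quick check can catch typos. The following is a minimal Python sketch, assuming a trimmed copy of the media type object above; `unknown_encoding_keys` is a hypothetical helper name, not part of any library.

```python
import json

# Trimmed copy of the multipart/form-data media type object from the schema above.
MEDIA_TYPE = json.loads("""
{
  "schema": {
    "type": "object",
    "properties": {
      "an_awesome_string": {"type": "string"},
      "image": {"type": "string", "format": "binary"}
    }
  },
  "encoding": {
    "an_awesome_string": {"contentType": "application/json"},
    "image": {"contentType": "image/*"}
  }
}
""")


def unknown_encoding_keys(media_type: dict) -> list[str]:
    """Return encoding keys that do not name a property in the schema."""
    properties = media_type.get("schema", {}).get("properties", {})
    encoding = media_type.get("encoding", {})
    return sorted(key for key in encoding if key not in properties)
```

Calling `unknown_encoding_keys(MEDIA_TYPE)` on the object above yields an empty list; a misspelled key such as `"imgae"` would be reported.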

Then you need to make sure the body is actually formatted as a form-data request with the headers the service requires. The main thing is that each part needs a Content-Disposition header with a name parameter, which identifies the form field the data came from. Make sure the boundary is properly defined, since that is what separates the body parts.

POST https://myawesomeserver.loca.lt/api/gpt/create HTTP/1.1
Content-Type: multipart/form-data; boundary=gc0p4Jq0M2Yt08jU534c0p
  
--gc0p4Jq0M2Yt08jU534c0p
Content-Disposition: form-data; name="image"; filename="image_name.png"
Content-Type: image/png
Content-Length: <number>
 
0M8R4KGxGuEAAAAAAAAAAAAAAAAAAAAAPgADAP7/CQAGAAAAAAAAAAAAAAABAAAAJgAAAAAAAAAA
EAAAKAAAAAEAAAD+////AAAAACUAAAD/////////////////////////////////////////////
////////////////////////////////////////////////////////////////////////////
////////////////////////////////////////////////////////////////////////////
////////////////////////////////////////////////////////////////////////////
////////////////////////////////////////////////////////////////////////////
////////////////////////////////////////////////////////////////////////////
////////////////////////////////////////////////////////////////////////////
///////////////////////////////////////////////////////////////////////////s
pcEAk8EJBAAA8BK/AAAAAAABEQABAAEACAAADggAAA4AYmpiagf4B/gAAAAAAAAAAAAAAAAAAAAA
AAAJBBYANA4AAGWSAQBlkgEADgAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAD//w8AAAAA
AAAAAAD//w8AAAAAAAAAAAD//w8AAAAAAAAAAAAAAAAAAAAAALcAAAAAAKwFAAAAAAAArAUAAHwT
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
AAAAAAAAAAAAAAAAAAA=

--gc0p4Jq0M2Yt08jU534c0p
Content-Disposition: form-data; name="an_awesome_string"
Content-Type: application/json
Content-Length: <number>

{
  "an_awesome_string": "test"
}

--gc0p4Jq0M2Yt08jU534c0p--
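
To make the layout of that request concrete, here is a minimal Python sketch that assembles the same two parts by hand. It is only an illustration of the wire format: `build_multipart` is a hypothetical helper, it sends the image as raw bytes rather than the text shown above, and it says nothing about whether a GPT Action will actually emit such a request.

```python
import json


def build_multipart(boundary: str, text_value: str, image_name: str,
                    image_bytes: bytes, image_type: str = "image/png") -> bytes:
    """Assemble a multipart/form-data body with one file part and one JSON part."""
    crlf = b"\r\n"
    delimiter = b"--" + boundary.encode("ascii")
    json_part = json.dumps({"an_awesome_string": text_value}).encode("utf-8")

    lines = [
        # File part: Content-Disposition carries the field name and filename.
        delimiter,
        f'Content-Disposition: form-data; name="image"; filename="{image_name}"'.encode(),
        f"Content-Type: {image_type}".encode(),
        b"",                      # empty line separates part headers from part body
        image_bytes,
        # String part, typed as JSON to match the encoding object in the schema.
        delimiter,
        b'Content-Disposition: form-data; name="an_awesome_string"',
        b"Content-Type: application/json",
        b"",
        json_part,
        # Closing delimiter: the boundary followed by two extra hyphens.
        delimiter + b"--",
        b"",
    ]
    return crlf.join(lines)
```

The returned bytes would be sent with a `Content-Type: multipart/form-data; boundary=...` request header that uses the same boundary string.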

If you're really keen to learn about multipart messages, you can find more info in RFC 2045 and RFC 2046 (MIME), and in RFC 7578, which defines multipart/form-data.

Information about the Content-Disposition header can be found in RFC 6266.

0
huw On

In short, it’s not really possible to upload files from a GPT Action at the time of writing. This is because the language model (GPT) itself generates the parameters for a GPT Action call, and the largest model output today (GPT-4-Turbo) is 4096 tokens (roughly 2^14, or about 16,000, characters). This means that the largest file you could theoretically prompt the model into uploading would have to fit in that output window. Uploading files directly to a GPT Action by other means isn’t supported yet.
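
To put a rough number on that limit, here is a small Python sketch of the arithmetic. It rests on two assumptions that the answer does not state: that the file would have to be written out as base64 text in the model's output, and that a token averages about four characters. The constant and function names are illustrative only.

```python
MAX_OUTPUT_TOKENS = 4096   # output limit quoted in the answer above
CHARS_PER_TOKEN = 4        # assumption: rough average for English text


def max_base64_payload_bytes(max_tokens: int = MAX_OUTPUT_TOKENS,
                             chars_per_token: int = CHARS_PER_TOKEN) -> int:
    """Largest raw file size whose base64 text fits in the character budget."""
    char_budget = max_tokens * chars_per_token
    # base64 turns every 3 raw bytes into 4 characters.
    return (char_budget // 4) * 3
```

Under these assumptions the ceiling is 12,288 bytes, about 12 KiB. Base64 text probably tokenizes less efficiently than English, so this should be read as an optimistic upper bound.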

If you need your model to accept files, I would recommend having an action that generates a secure endpoint on your web server where the user can upload a file directly to your service (say https://your-service/upload?id=uniqueidhere), asking the user to click on that link and upload their file, then having another action reference that unique ID to retrieve and operate on the file. Keep in mind that you can’t use an action to pass an image to GPT-4 Vision yet either, but you could call the GPT-4 Vision API on your backend with the user’s file. None of these experiences will feel seamless, but hopefully it’s some inspiration on how to proceed.
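
The flow described above can be sketched in Python as three small functions. This is a minimal in-memory illustration, not a production design: the function names, the `_uploads` dictionary, and the use of `secrets.token_urlsafe` for the ID are all assumptions, and a real service would need persistent storage, expiry, and authentication.

```python
import secrets
from typing import Optional

BASE_URL = "https://your-service/upload"      # placeholder host from the answer above
_uploads: dict[str, Optional[bytes]] = {}     # in-memory stand-in for real storage


def create_upload_link() -> tuple[str, str]:
    """Action 1: reserve an unguessable ID and return it with the upload URL."""
    upload_id = secrets.token_urlsafe(16)
    _uploads[upload_id] = None                # slot exists, but no file yet
    return upload_id, f"{BASE_URL}?id={upload_id}"


def receive_upload(upload_id: str, data: bytes) -> None:
    """Called by the web server when the user posts a file to the link."""
    if upload_id not in _uploads:
        raise KeyError("unknown upload id")
    _uploads[upload_id] = data


def fetch_upload(upload_id: str) -> bytes:
    """Action 2: look up the file by ID so the backend can process it."""
    data = _uploads.get(upload_id)
    if data is None:
        raise LookupError("no file uploaded for this id yet")
    return data
```

The GPT would call the first action, show the returned link to the user, and later pass the same ID to the second action once the user confirms the upload.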

0
GPTs App On

You'll need to update the requestBody to include both the image and the string. Since you want to send a file along with text data, using multipart/form-data is the correct approach. Here's how you can modify the schema:

{
  "openapi": "3.1.0",
  "info": {
    "title": "Send an image and a string",
    "description": "Makes it super easy to send an image and a string",
    "version": "v1.0.0"
  },
  "servers": [
    {
      "url": "https://myawesomeserver.loca.lt"
    }
  ],
  "paths": {
    "/api/gpt/create": {
      "post": {
        "description": "Create a string and image",
        "operationId": "CreateImageandString",
        "requestBody": {
          "description": "Image and string to be uploaded",
          "required": true,
          "content": {
            "multipart/form-data": {
              "schema": {
                "type": "object",
                "properties": {
                  "image": {
                    "type": "string",
                    "format": "binary"
                  },
                  "an_awesome_string": {
                    "type": "string"
                  }
                }
              }
            }
          }
        },
        "responses": {
          "200": {
            "description": "Placeholder: define your response schema here"
          }
        },
        "deprecated": false
      }
    }
  },
  "components": {
    "schemas": {}
  }
}