Mongo Driver for Azure Cosmos - Reads and Rate Limiting


I'm having an issue with the following configuration:

Driver version: 3.12.1, mongodb-driver for Java

Server Version: 3.2 of Mongo API for Azure Cosmos DB (Ancient, I know)

We run some fairly high read/write loads and may hit rate limiting from the Cosmos API for Mongo. When that happens, I expect an exception to occur. We're doing pretty vanilla queries; the code looks similar to this:

public DatabaseQueryResult find(String collectionName, Map<String, Object> queryData) {

    Document toFind = new Document(queryData);
    MongoCollection<Document> collection = this.mongoDatabase.getCollection(collectionName);

    FindIterable<Document> findResults = collection.find(toFind);

    // find() never returns null; first() returns null when nothing matches
    Document dataFound = findResults.first();
    if (dataFound != null) {
        return new DatabaseQueryResult(dataFound.toJson(this.settings));
    }

    // other stuff...
}

When rate limited by Azure, you'll receive a response like this:

{
   "$err":"Message: {\"Errors\":[\"Request rate is large. More Request Units may be needed, so no changes were made. Please retry this request later. Learn more: http://aka.ms/cosmosdb-error-429\"]}\r\n s",
   "code":16500,
   "_t":"OKMongoResponse",
   "errmsg":"Message: {\"Errors\":[\"Request rate is large. More Request Units may be needed, so no changes were made. Please retry this request later. Learn more: http://aka.ms/cosmosdb-error-429\"]}\r\n",
   "ok":0
}

I expect an exception to be thrown here, but that doesn't seem to be the case with the newer driver. What's happening is:

  • collection.find is returning a FindIterable with the JSON error result as above as the first document
  • We're eventually returning a DatabaseQueryResult with JSON error as the query payload

I don't want this to happen - I'd much prefer the mongo driver to throw a MongoCommandException/MongoQueryException if a query operation returns an OKMongoResponse where "ok" is 0. This seems fine on writes, which use a CommandProtocol object and have the response validated as I'd expect - it's just reads that seem to have changed.

Comparing the 2 driver versions, this seems to be a change in read behaviour - perhaps due to retryable reads that were introduced in version 3.11? Response validation now seems to be around this section.

Q: Is there a way to configure my Mongo client so that the driver will validate server responses on read operations and throw an exception if it receives an OKMongoResponse with ok == 0?

I can of course validate the results myself, but I'd prefer not to, and would rather let the driver do this if possible.
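For reference, the manual validation mentioned above might look something like the following minimal sketch. The names `CosmosResponseValidator`, `requireOk`, and `ServerErrorDocumentException` are hypothetical and not part of the driver. The helper accepts a `Map<String, Object>` so that it compiles without the driver on the classpath; `org.bson.Document` implements that interface, so a `Document` can be passed in directly. It assumes an error response always carries "ok" equal to 0 plus an "errmsg" or "$err" field, as in the sample above.

```java
import java.util.Map;

final class CosmosResponseValidator {

    /** Hypothetical exception type for this sketch; not a driver class. */
    static final class ServerErrorDocumentException extends RuntimeException {
        private final int code;

        ServerErrorDocumentException(int code, String message) {
            super(message);
            this.code = code;
        }

        int getCode() {
            return code;
        }
    }

    private CosmosResponseValidator() {
    }

    /**
     * Returns the document unchanged unless it looks like a server error
     * payload (assumption: "ok" is 0 and "errmsg" or "$err" is present),
     * in which case an exception is thrown instead.
     */
    static <T extends Map<String, Object>> T requireOk(T doc) {
        if (doc == null) {
            return null; // nothing matched the query
        }
        Object ok = doc.get("ok");
        boolean okIsZero = ok instanceof Number && ((Number) ok).doubleValue() == 0.0;
        // Also require an error text field, so a normal document that merely
        // contains a field named "ok" is not mistaken for an error.
        boolean hasErrorText = doc.containsKey("errmsg") || doc.containsKey("$err");
        if (okIsZero && hasErrorText) {
            Object code = doc.get("code");
            int errorCode = code instanceof Number ? ((Number) code).intValue() : -1;
            Object message = doc.containsKey("errmsg") ? doc.get("errmsg") : doc.get("$err");
            throw new ServerErrorDocumentException(errorCode, String.valueOf(message));
        }
        return doc;
    }
}
```

In the find method above, this would wrap the result of `findResults.first()` before it is converted to JSON. A caller could then check `getCode()` for 16500 to recognise the rate-limit case.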


There are 2 answers

Mark Brown

I'm not sure why Mongo changed this driver. There is something on the Cosmos side which may help. You can raise a support ticket and ask them to turn on server-side retries. This will change the behavior of Cosmos such that requests queue up rather than returning 429s when there are too many.

This more closely reflects how Mongo behaves when running on a VM or in Atlas (which also runs on VMs) rather than on a multi-tenant service like Cosmos DB.
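Where server-side retries are not available, a client-side retry with exponential backoff is a common alternative for this kind of error. The following is only a rough sketch under stated assumptions: `BackoffRetry` and its parameters are hypothetical names, and it assumes the rate-limit condition reaches your code as a `RuntimeException`, whether thrown by the driver or by your own response check.

```java
import java.util.function.Predicate;
import java.util.function.Supplier;

final class BackoffRetry {

    private BackoffRetry() {
    }

    /**
     * Runs the operation, retrying with a doubling delay while the supplied
     * predicate classifies the failure as rate limiting. Any other failure,
     * or the failure from the last allowed attempt, is rethrown.
     */
    static <T> T retry(Supplier<T> operation,
                       Predicate<RuntimeException> isRateLimited,
                       int maxAttempts,
                       long baseDelayMillis) {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be at least 1");
        }
        long delay = baseDelayMillis;
        for (int attempt = 1; ; attempt++) {
            try {
                return operation.get();
            } catch (RuntimeException e) {
                if (attempt >= maxAttempts || !isRateLimited.test(e)) {
                    throw e;
                }
                try {
                    Thread.sleep(delay);
                } catch (InterruptedException interrupted) {
                    // Restore the interrupt flag and give up on retrying.
                    Thread.currentThread().interrupt();
                    throw e;
                }
                delay *= 2; // back off further before the next attempt
            }
        }
    }
}
```

The predicate is where the Cosmos-specific check would go, for example matching error code 16500 from the response shown in the question.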

D. SM

With 3.2-3.4 servers, the drivers use the find command described here, not OP_QUERY.

The driver surely is not "returning OKMongoResponse" since it isn't written for cosmosdb.

If you think there is a driver issue, update the question with the exact wire protocol response received and the exact result you receive from the driver.

Retryable writes require sessions (which cosmosdb advertises but does not support, see Importing BSON to CosmosDB MongoDB API using mongorestore) and normally use the OP_MSG protocol, which comes with 3.6+ servers. I don't know what drivers would do if a 3.2 server advertises session support; this isn't a combination that is possible with MongoDB.

Note that MongoDB does not support cosmosdb (and consequently MongoDB drivers don't, officially, either).