Azure Cosmos DB performance for elements with a large property containing HTML


We're using Azure Cosmos DB Graph API to cache items from a CMS that have properties containing a fairly big chunk of html.

After adding 8000 items, Cosmos DB has become very slow.

For instance this simple query takes about 12-15 seconds to complete:

g.V().hasLabel('news').limit(10)

The data in each vertex is around 4-5 KB, and I've excluded the Content property in the graph settings.

I've increased the throughput to 5000 RU/s, and the Monitor tab in the Azure Portal seems to indicate that this is enough. The throughput estimation guidance suggests that 5000 RU/s should be enough for 500 reads/s, but I can't even do one.

Querying items without the HTML property, such as g.V().hasLabel('user'), is still fast.

I also tried to exclude the path from indexing, but it made no difference (I haven't reloaded the items; is that necessary?)

"excludedPaths": [
    {
        "path": "/Content/?"
    }
]
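For comparison, here is a minimal sketch of an indexing policy that excludes the whole subtree under a property rather than only a scalar value. In Cosmos DB path syntax, "/?" matches a scalar stored directly at that path, while "/*" matches everything below it. The sketch assumes that the Graph API stores each vertex property as a nested structure instead of a plain string, in which case "/Content/?" may not match anything; that storage detail is an assumption worth checking against a stored document. The helper name is made up for illustration. Note also that excluding a path from the index mainly affects write cost and index size, and would not by itself shrink the documents a query returns.

```python
import json


def indexing_policy_excluding(property_name):
    """Build an indexing policy dict that skips everything under one property.

    Hypothetical helper, not part of any SDK.
    """
    return {
        "indexingMode": "consistent",
        "includedPaths": [{"path": "/*"}],
        # "/*" excludes the whole subtree below the property; "/?" would only
        # match a scalar value stored directly at that path.
        "excludedPaths": [{"path": f"/{property_name}/*"}],
    }


# Print the policy as JSON so it can be compared with the portal settings.
print(json.dumps(indexing_policy_excluding("Content"), indent=4))
```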

What can I do to get this up to speed?

There is 1 answer

Michael Finger (best answer)

If you are using the .NET SDK, it appears that the request retrieves all results matching the hasLabel filter and that the limit step is then applied in client-side SDK code.
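A rough back-of-the-envelope estimate shows why that would hurt. The sketch below uses the figures from the question (8000 items of roughly 5 KB each) and assumes that all of them carry the 'news' label and that protocol overhead can be ignored; the function name is made up for illustration.

```python
def estimated_payload_kb(vertex_count, kb_per_vertex):
    """Approximate result-set size in KB, ignoring protocol overhead."""
    return vertex_count * kb_per_vertex


# Assumption: all 8000 cached items are 'news' vertices of about 5 KB each.
without_top = estimated_payload_kb(8000, 5)  # limit applied on the client
with_top = estimated_payload_kb(10, 5)       # limit applied on the server

print(f"client-side limit: ~{without_top / 1000:.0f} MB transferred")
print(f"server-side limit: ~{with_top} KB transferred")
```

Under these assumptions the client pulls tens of megabytes to return ten vertices, which would be consistent with a query that takes many seconds even when the provisioned throughput looks sufficient.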

I captured a few queries that use the limit step in Fiddler, and no matter the value, the query in the request does not contain a TOP clause. The DocumentDB query in the body of the request looks like this: {"query":"SELECT N_2 FROM Node N_2 WHERE (IS_DEFINED(N_2._isEdge) = false AND (N_2.label = 'news'))"}

I would expect it to be: {"query":"SELECT TOP 10 N_2 FROM Node N_2 WHERE (IS_DEFINED(N_2._isEdge) = false AND (N_2.label = 'news'))"}
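To make the difference between the two request bodies explicit, here is a small sketch that builds both shapes. It is not how the SDK generates its queries; build_label_query is a hypothetical helper, and the query text is copied from the captured request above.

```python
import json


def build_label_query(label, top=None):
    """Return a request body shaped like the one captured in Fiddler.

    When `top` is given, a TOP clause is added so that the server,
    not the client, limits the result set. Hypothetical helper.
    """
    if "'" in label:
        raise ValueError("label must not contain a single quote")
    if top is not None and (not isinstance(top, int) or top < 1):
        raise ValueError("top must be a positive integer")
    top_clause = f"TOP {top} " if top is not None else ""
    query = (
        f"SELECT {top_clause}N_2 FROM Node N_2 "
        f"WHERE (IS_DEFINED(N_2._isEdge) = false AND (N_2.label = '{label}'))"
    )
    return {"query": query}


# Observed shape (no TOP) versus the expected shape (TOP 10).
print(json.dumps(build_label_query("news")))
print(json.dumps(build_label_query("news", top=10)))
```

Whether the SDK can be made to emit the TOP form is not something this sketch answers; it only shows that the two bodies differ by a single clause.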