Azure CosmosDb continuation token for SELECT with JOIN

286 views Asked by At

I'm using async client from java library com.azure:azure-cosmos:4.3.0 to connect and query azure cosmosdb(SQL) in direct mode. All the documents in the collection have the following structure

{
 "id":"string",
 "timestamp":123456789,
 "tags":["tag1", "tag2"]
}

What I want to do is find all documents that have any matching tags from a given list of input tags. What is the best way to do this?

What I've tried and not able to make it work -

I inserted four documents to test with and then I ran this query in azure portal - SELECT DISTINCT VALUE CONTAINER_ALIAS FROM CONTAINER_ALIAS JOIN CONTAINER_ALIAS_tags IN CONTAINER_ALIAS.tags WHERE CONTAINER_ALIAS_tags IN ( "tag1", "tag2" ) AND ( CONTAINER_ALIAS.timestamp BETWEEN 1599773389000 AND 1602365389000 ). This gave me 3 documents as response which is right because out of 4 test documents, one of the document doesn't have "tag1" or "tag2"

Then I tried to generate the same query using the client library and it gets to SELECT DISTINCT VALUE CONTAINER_ALIAS FROM CONTAINER_ALIAS JOIN CONTAINER_ALIAS_tags IN CONTAINER_ALIAS.tags WHERE CONTAINER_ALIAS_tags IN ( "tag1", "tag2" ) AND ( CONTAINER_ALIAS.timestamp BETWEEN @timestamp_START AND @timestamp_End )

This query is passed to the database as follows

List<SqlParameters> sqlParameters = Arrays.asList(
    new SqlParameter("@timestamp_START", 1599773389000),
    new SqlParameter("@timestamp_End", 1602365389000)
);
SqlQuerySpec sqlQuerySpec = new SqlQuerySpec(query, sqlParameters)
long pageSize=20
container.queryItems(sqlQuerySpec, MyCustomEntity.class)
                    .byPage(continuationToken, pageSize)
                    .take(1)
                    .next()

But this only returns 1 document in result. But it does return a continuation token {"lastHash":"","sourceToken":"{\"token\":null,\"range\":\"{\\\"min\\\":\\\"05C1CFFFFFFFF8\\\",\\\"max\\\":\\\"05C1D7FFFFFFFC\\\",\\\"isMinInclusive\\\":true,\\\"isMaxInclusive\\\":false}\"}"}

I have used the client library with WHERE but without JOIN and I was able to make it work properly. The result would include pageSize number of documents and it would return a proper continuation token if there are more documents available. However, when I use join I'm not getting all the results

Other thing I noticed was the RU when I run from azure portal for the above query is around 46 but when I print the logs from the client library the RU is around 2. Not sure if this information is helpful in anyway to figure what I'm doing wrong

Please let me know what I'm doing wrong or is there a better way to achieve this

0

There are 0 answers