Getting previous page's results when using skip and limit while querying DocumentDB


I have a collection with ~800,000 documents, and I am trying to fetch all of them, 5,000 at a time.

When running the following code:

const CHUNK_SIZE = 5000;

let skip = 0;
let matches;

do {
  matches = await dbClient
    .collection(collectionName)
    .find({})
    .skip(skip)
    .limit(CHUNK_SIZE)
    .toArray();

  // ... some processing
  skip += CHUNK_SIZE;
} while (matches.length);

After about 30 iterations, I start getting documents I already received in a previous iteration.

What am I missing here?

Answered by Skami

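As noted in the comments, you need to apply a .sort() to the query. Without one, the order of results is not guaranteed to be the same from one query to the next, so pages built with skip and limit can overlap. The cheapest way to do this is to sort on _id, which is always indexed, e.g.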

.sort({ _id: 1 })

Neither MongoDB nor Amazon DocumentDB guarantees any particular result order without an explicit sort.

Amazon DocumentDB

Mongo Result Ordering
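
Putting the loop from the question together with the sort on _id, the whole thing might look roughly like the sketch below. The function name fetchAllInChunks and the handleChunk callback are made up for illustration (they stand in for "some processing"), and collection is assumed to be a collection object from the MongoDB Node.js driver, such as the one returned by dbClient.collection(collectionName) in the question.

```javascript
// Sketch: page through a collection with an explicit sort so pages do not overlap.
// Assumptions: `collection` exposes the driver's find/sort/skip/limit/toArray chain;
// `handleChunk` is a hypothetical callback standing in for "some processing".
async function fetchAllInChunks(collection, chunkSize, handleChunk) {
  let skip = 0;
  let total = 0;
  let matches;

  do {
    matches = await collection
      .find({})
      .sort({ _id: 1 }) // unique sort key, so every page has a well-defined position
      .skip(skip)
      .limit(chunkSize)
      .toArray();

    if (matches.length > 0) {
      await handleChunk(matches);
    }

    total += matches.length;
    skip += chunkSize;
  } while (matches.length > 0);

  // Number of documents handed to handleChunk across all pages.
  return total;
}
```

It could be called as await fetchAllInChunks(dbClient.collection(collectionName), 5000, processChunk), where processChunk is your own function. Note that the sort only makes the pages stable if documents are not inserted or deleted while the loop is running; concurrent writes can still shift documents between pages.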