I try to get every latest entry of a "group" out of Mongo.
You can image the data like a set of wiki entries:
- Each entry can be edited (and old versions are kept).
- We are interested in the latest version of the entries.
- The entries (and their ancestors) are sorted by a certain logic.
To get more into detail:
- Each entry is identified by an
itemId
- Each
itemId
can occur 1 to n times in the dataset - The entry versions are sorted by
orderHash
Here you can see a small batch of data:
{
"_id": ObjectId("5f69b963b8705200282d6174"),
"itemId": "xszy",
"orderHash": "383u",
"title": "A",
"content": "ABC"
},
{
"_id": ObjectId("5f69b963b8705200282d6175"),
"itemId": "e92q",
"orderHash": "5j12",
"title": "K",
"content": "KLMN"
},
{
"_id": ObjectId("5f69b963b8705200282d6178"),
"itemId": "xszy",
"orderHash": "p578",
"title": "A",
"content": "ABD"
},
{
"_id": ObjectId("5f69b963b8705200282d6180"),
"itemId": "mtewy",
"orderHash": "383u",
"title": "L",
"content": "CASE"
},
{
"_id": ObjectId("5f69b963b8705200282d6189"),
"itemId": "mtewy",
"orderHash": "5j12",
"title": "L1",
"content": "CASE"
}
I use the following aggregate pipeline to get my desired result:
db.getCollection('wiki').aggregate([
{
// create a sortable field for mongo
$set: {
"sortField": {
"$indexOfArray": [
[ "p578", "5j12", "383u" ], "$orderHash"
]
}
}
},
{
// sort by created sort field
"$sort": {
"sortField": 1
}
},
{
// group items by itemId and "save" them sorted in an array
$group: {
_id: "$itemId",
data: {
$push: "$$ROOT"
}
}
},
{
// get first entry of each "group array" to obtain
// the latest version of each entry
$project: {
resp: { $arrayElemAt: ['$data', 0] }
}
}
])
So far this is working. But it feels wrong to store the data in an extra array and I'm not sure about the performance (especially for larger data sets with a lot of edits). Another flaw is that I don't simply get an array of documents back but the data is nested in a documents property resp
(which is not good to use for e.g. mongoose).
Is there a more mongo-nic solution for my problem?
SOLUTION:
Thanks to Joe (see comments). He pointed me to the solution.
db.getCollection('wiki').aggregate([
{
$set: {
"sortField": {
"$indexOfArray": [
[ "p578", "5j12", "383u" ], "$orderHash"
]
}
}
},
{
"$sort": {
"sortField": 1
}
},
{
$group: {
_id: "$itemId",
resp: {
$first: "$$ROOT"
}
}
},
{
$replaceRoot: {
newRoot: "$resp"
}
}
])