Mongo aggregation pipeline, finding out the total number of entries in an array per user

82 views Asked by At

I have a collection, lets call it 'user'. In this collection there is a property entries, which holds a variably sized array of strings,

I want to find out the total number of these strings across my collection.

db.users.find()
> [{ entries: [] }, { entries: ['entry1','entry2']}, {entries: ['entry1']}]

So far I have have made many attempts here are some of my closest.

     db.users.aggregate([
      { $project:
        { numberOfEntries:
          { $size: "$entries" } }
          },
           { $group: 
             {_id: { total_entries: { $sum: "$entries"}
                     }
                  } 
               }
            ])

What this gives me is a list of the users with the total number of entries, now what I want is each of the total_entries figures added up to get my total. Any ideas of what I am doing wrong. Or if there is a better way to start this?

3

There are 3 answers

1
Andy Macleod On BEST ANSWER

A possible solution could be:

db.users.aggregate([{
    $group: {
        _id: 'some text here',
        count: {$sum: {$size: '$entries'}}
    }
}]);

This will give you the total count of all entries across all users and look like

[
    {
        _id: 'some text here',
        count: 3
    }
]

I would use $unwind in the case that you want individual entry counts. That would look like

db.users.aggregate([
    { $unwind: '$entries' },
    {$group: {
        _id: '$entries',
        count: {$sum: 1}
    }
])

and this will give you something along the lines of:

[
    {
        _id: 'entry1',
        count: 2
    },
    {
        _id: 'entry2',
        count: 1
    }
]
0
chridam On

You were in the right direction though you just needed to specify an _id value of null in the $group stage to calculate accumulated values for all the input documents as a whole i.e.

db.users.aggregate([
    {
        "$project": {
            "numberOfEntries": { 
                "$size": {
                    "$ifNull": ["$entries", []]
                }
            }          
        }
    },
    {
        "$group": {
            "_id": null, /*  _id of null to get the accumulated values for all the docs */
            "totalEntries": { "$sum": "$numberOfEntries" }
        }       
    }
])

Or with just a single pipeline as:

db.users.aggregate([
    {
        "$group": {
            "_id": null, /*  _id of null to get the accumulated values for all the docs */
            "totalEntries": { 
                "$sum": { 
                    "$size": {
                        "$ifNull": ["$entries", []]
                    }
                }   
            }
        }       
    }
])
1
Xavier Guihot On

In case you want the overall distinct nbr of entries:

> db.users.aggregate([
   { $unwind: "$entries" },
   { $group: { _id: "$entries" } },
   { $count: "total" }
])
{ "total" : 2 }

In case you want the overall nbr of entries:

> db.users.aggregate( [ { $unwind: "$entries" }, { $count: "total" } ] )
{ "total" : 3 }

This makes use of the "unwind" operator which flattens elements of an array from records:

> db.users.aggregate( [ { $unwind: "$entries" } ] )
{ "_id" : ObjectId("5a81a7a1318e1cfc10250430"), "entries" : "entry1" }
{ "_id" : ObjectId("5a81a7a1318e1cfc10250430"), "entries" : "entry2" }
{ "_id" : ObjectId("5a81a7a1318e1cfc10250431"), "entries" : "entry1" }