Aggregate C# groupby in array

757 views Asked by At

I have the following structure and I want to aggregate over a list of movies to get all the actors and their image.

{
"_id" : 1,
"Title" : "Pirates of the Caribbean: The Curse of the Black Pearl",
"Actors" : [ 
    {
        "Name" : "Johnny Depp",
        "NameInMovie" : "Captain Jack Sparrow",
        "ImageUrl" : "Johnny-Depp.png"
    }, 
    {
        "Name" : "Geoffrey Rush",
        "NameInMovie" : "Captain Hector Barbossa",
        "ImageUrl" : "Geoffrey-Rush.png"
    }, 
    {
        "Name" : "Orlando Bloom",
        "NameInMovie" : "Will Turner",
        "ImageUrl" : "Orlando-Bloom.png"
    }
]

},

{
"_id" : 2,
"Title" : "Pirates of the Caribbean: At World's End",
"Actors" : [ 
    {
        "Name" : "Johnny Depp",
        "NameInMovie" : "Captain Jack Sparrow",
        "ImageUrl" : "Johnny-Depp.png"
    }, 
    {
        "Name" : "Geoffrey Rush",
        "NameInMovie" : "Captain Hector Barbossa",
        "ImageUrl" : "Geoffrey-Rush.png"
    }, 
    {
        "Name" : "Orlando Bloom",
        "NameInMovie" : "Will Turner",
        "ImageUrl" : "Orlando-Bloom.png"
    }
]

}

Tried to get the result using the Linq AsQueryable function, but it is not supported:

var actors = Movies
.AsQueryable()
.SelectMany(x => x.Actors)
.GroupBy(x => x.Name)
.Select(a => a.First())
.ToList();

How to get this working with MongoDB driver2

The result should like like

{"Name" : "Johnny Depp",
"ImageUrl" : "Johnny-Depp.png" }, 
{"Name" : "Geoffrey Rush",
"ImageUrl" : "Geoffrey-Rush.png" }, 
{"Name" : "Orlando Bloom",
"ImageUrl" : "Orlando-Bloom.png" }

Solved with

var actors = Movies.AsQueryable()
.SelectMany(x => x.Actors)
.GroupBy(x => new Actor { Name = x.Name, ImageUrl = x.ImageUrl })
.Select(x => x.Key)
.ToList();

This gives the following query

[{
    "$unwind": "$Actors"
},
{
    "$project": {
        "Actors": "$Actors",
        "_id": 0
    }
},
{
    "$group": {
        "_id": {
            "ImageUrl": "$Actors.ImageUrl",
            "Name": "$Actors.Name"
        }
    }
},
{
    "$project": {
        "_id": "$_id"
    }
}]
1

There are 1 answers

0
Neil Lunn On BEST ANSWER

You were close, but mostly missing on the actual selection of the properties within the $group:

var results = Movies.AsQueryable()
 .SelectMany( x => x.Actors )
 .GroupBy( x => x.Name )
 .Select( x => new { Name = x.Key, ImgUrl = x.First().ImageUrl })
 .ToArray();

So it's the .Select() that actually needs the different syntax to what you are using. This gives you a pipeline that serializes like:

    [
            {
                    "$unwind" : "$Actors"
            },
            {
                    "$project" : {
                            "Actors" : "$Actors",
                            "_id" : 0
                    }
            },
            {
                    "$group" : {
                            "_id" : "$Actors.Name",
                            "__agg0" : {
                                    "$first" : "$Actors.ImageUrl"
                            }
                    }
            },
            {
                    "$project" : {
                            "Name" : "$_id",
                            "ImgUrl" : "$__agg0",
                            "_id" : 0
                    }
            }
    ],

Which IMHO is not completely "optimal", but that's what you get with LINQ interface.

There is alternately:

var results = Movies.AsQueryable()
 .SelectMany( x => x.Actors )
 .GroupBy( x => new { Name = x.Name, ImgUrl = x.ImageUrl })
 .ToArray();

Which comes out as:

  [
            {
                    "$unwind" : "$Actors"
            },
            {
                    "$project" : {
                            "Actors" : "$Actors",
                            "_id" : 0
                    }
            },
            {
                    "$group" : {
                            "_id" : {
                                    "Name" : "$Actors.Name",
                                    "ImgUrl" : "$Actors.ImageUrl"
                            }
                    }
            }
    ],

Which does not use $first and therefore is not "exactly" the same thing, but it does come out to the same result in your case, since ImgUrl is always the same value of each Name on the Actor.