Split List<T> into Sublists<T> with LINQ

530 views Asked by At

I've got a List with n Foo items. Foo contains a long property Foo.FileSize. Now i would like to split this list to sublists with n elements which the sum of FileSize is not more than 10000. Of cource there are items with Foo.FileSize more than 10000 as well. For this special case, just need a sublist with only this item.

Please can someone suggest something?

const long maxdownloadsize = 10485760;
long actualdownloadsize = 0;
List<TI> downloadTI = new List<TI>();

for (int i = 0; i < comparedTI.Count; i++)
{
    var ti = comparedTI[i];

    actualdownloadsize += ti.FileSize;
    downloadTI.Add(ti);

    if (actualdownloadsize > maxdownloadsize || i == comparedTI.Count-1)
    {
        actualdownloadsize = 0;
        AddToList(downloadTI);
        downloadTI = new List<TI>();
    }
}
2

There are 2 answers

0
dbc On

Honestly it's easier to do this in a traditional manner rather than with Linq, since the transformation method required for each element in the sequence requires knowledge of all previous elements. So, you could do something like this:

    public static IEnumerable<List<T>> Partition<T>(IEnumerable<T> list, Func<T, long> getValue, long maxSum)
    {
        long sum = 0;
        int partition = 0;
        var query = list.Select((i, index) =>
            {
                if (index == 0)
                {
                    // Reset the external partition counter in case the query is run multiple times.
                    sum = 0;
                    partition = 0;
                }
                var value = getValue(i);
                sum = sum + value;
                if (sum > maxSum)
                {
                    sum = value;
                    partition++;
                }
                return new KeyValuePair<int, T>(partition, i);
            })
            .GroupBy(pair => pair.Key)
            .Select(g => g.Select(pair => pair.Value).ToList());
        return query;
    }

And then call it like:

        var list = new List<Foo>();

        // Fill up the list.

        var query = Partition(list, f => f.FileSize, 10000);
        var partition = query.ToList();

While this meets the requirement of "Split List<T> into Sublists<T> with LINQ", it is actually less efficient than doing it directly.

0
Ryan On

This isn't pretty, but you can use local variables and LINQ GroupBy() to split the list into groups by size. Also, there is no optimization with this function so you may not get the optimal number of groups. In other words, if you have a large file in the middle of the list then it will start a new group even if it would be possible to fit more smaller files into the current group.

var maxGroupSize = 10485760;
var groupId = 0;
var groupCount = 0;
long groupSize = 0;
var groups = files.GroupBy(f => 
{
    var size = f.FileSize;
    if ((groupCount > 0) && ((groupSize + size) > maxGroupSize))
    {
        // Start a new group and reset count and sum
        groupId++;
        groupCount = 0;
        groupSize = 0;
    }
    groupSize += size;
    groupCount++;
    return groupId;
});

The GroupBy method returns IGrouping<TKey, TSource> and if you need to convert it to a List then do the following (assuming the TSource type is TI from your example code):

List<TI> downloadTI = groups.Select(p => new List<TI>(groups[p.Key]));