Most efficient way to clone a list into an existing list, minimizing memory reallocation?

205 views Asked by At

I have a need to clone an existing list into another, existing list. As the environment demands very high performance, I need to eliminate unnecessary memory reallocation.

The most efficient algorithm I can think of is the following, which will increase the capacity of the destination list to match the source one as required, but will not decrease its own. (This is acceptable behavior for this project.)

    public static void CloneInto(this List<T> source, List<T> destination)
    {
        if (destination.Capacity < source.Capacity)
        {
            /* Increase capacity in destination */
            destination.Capacity = source.Capacity;
        }

        /* Direct copy of items within the limit of both counts */
        for (var i = 0; i < source.Count && i < destination.Count; i++)
        {
            destination[i] = source[i];
        }

        if (source.Count > destination.Count)
        {
            /* Append any extra items from source */
            for (var i = destination.Count; i < source.Count; i++ )
            {
                destination.Add(source[i]);
            }
        } 
        else if (source.Count < destination.Count)
        {
            /* Trim off any extra items from destination */
            while (destination.Count > source.Count)
            {
                destination.RemoveAt(destination.Count - 1);
            }
        }
    }

However this seems a lot of code, logic and loops.

Is there a more efficient way to clone a list into an existing list while avoiding unnecessary memory allocations?

3

There are 3 answers

5
xanatos On BEST ANSWER
if (destination.Capacity < source.Capacity)
{
    /* Increase capacity in destination */
    destination.Capacity = source.Capacity;
}

This is probably wrong... The source.Capacity could be bigger than necessary...

And you are "copying" the elements already contained in destination to a "new" destination "buffer". This copy is unnecessary because then the elements of destination are discarded

So you should at least:

if (destination.Capacity < source.Count)
{
    /* Increase capacity in destination */
    destination.Clear();
    destination.Capacity = source.Capacity;
}

In this way, if the Capacity is changed, no elements need to be copied (note that I'm using the Capacity = Count, because while this would economize memory in the short term, in the long term it could cause multiple allocations of memory). Note that I'm checking against the source.Count, but for the assignment of Capacity I'm using the same source.Capacity. This to minimize reallocations of destination if the method is called multiple times.

Small optimization:

if (destination.Capacity < source.Count)
{
    /* Increase capacity in destination */
    destination.Capacity = 0;
    destination.Capacity = source.Capacity;
}

This because List.Clear() uses Array.Clear(), so it really zeroes the internal elements, while destination.Capacity = 0 is optimized to simply reassign the internal array to a static _emptyArray.

You could even try:

public static void CloneInto(this List<T> source, List<T> destination)
{
    destination.Capacity = 0; // or destination.Clear();
    destination.AddRange(source);
}

but by looking at the source code, I think it is slower than necessary (because it first copies the elements to T[] itemsToInsert = new T[count];, so it does two copies of the elements...

Then:

while (destination.Count > source.Count)
{
    destination.RemoveAt(destination.Count - 1);
}

can be optimized to:

destination.RemoveRange(source.Count, destination.Count - source.Count);
6
LInsoDeTeh On

Why don't you just use

destination = source.ToList();

ToList() creates a copy of the list and everything which was in destination before, will immediately be ready for garbage collection after the assignment.

0
Fabjan On

Little test on perfomance :

public static List<T> CloneInto<T>(List<T> source, List<T> destination)
{
    if (destination.Capacity < source.Capacity)
    {
        /* Increase capacity in destination */
        destination.Capacity = source.Capacity;
    }

    /* Direct copy of items within the limit of both counts */
    for (var i = 0; i < source.Count && i < destination.Count; i++)
    {
        destination[i] = source[i];
    }

    if (source.Count > destination.Count)
    {
        /* Append any extra items from source */
        for (var i = destination.Count; i < source.Count; i++)
        {
            destination.Add(source[i]);
        }
    }
    else if (source.Count < destination.Count)
    {
        /* Trim off any extra items from destination */
        while (destination.Count > source.Count)
        {
            destination.RemoveAt(destination.Count - 1);
        }
    }

    return destination;
}


static void Main(string[] args)
{
    List<string> list1 = new List<string>();
    List<string> list2 = new List<string>();

    for (int i = 0; i < 100000; i++)
    {
        list1.Add(Guid.NewGuid().ToString());                
    }

    Stopwatch s = new Stopwatch();

    s.Start();
    CloneInto(list1, list2);

    s.Stop();

    Console.WriteLine("Your Method: " + s.Elapsed.Ticks);

    s.Reset();
    s.Start();

    list2 = list1.ToList();

    s.Stop();

    Console.WriteLine("ToList() Method: " + s.Elapsed.Ticks);

    Console.ReadKey();

result:

Test

But if it is all about memory only - your method is better than .ToList() and there isn't much you can do to improve performance even more. May be you can use parallel loops like parallel.for but not sure about that.