LINQ Except/Distinct based on few columns only, to not add duplicates


Excuse the confusing title; I couldn't come up with anything simpler.

I have the class below.

public class Foo
{
    public int FooId { get; set; }

    public string FooField { get; set; }

    //some more fields here....
    //....
    //....

    public DateTime CreatedDate { get; set; }
}

I need to loop through something and add a range of Foo to a List<Foo> without adding duplicates, where a duplicate is defined by the combination of FooId and FooField.

This is what I am trying:

List<Foo> foos = new List<Foo>();

foreach (Bar item in blah.Bars)
{
    //Some code here to get the foos from the item
    List<Foo> processedFoos = item.GetFoos();

    //The problem is with below line
    foos.AddRange(processedFoos.Except(foos));
}

Except adds all the records and so duplicates the FooId and FooField combination, because CreatedDate is unique for every record.

I need to ignore CreatedDate and add only those records that do not violate the unique combination.

What can I do here? Except? Distinct? Any other alternative?

Most importantly, how?

There are 4 answers

0
Jon Skeet On BEST ANSWER

You either need to override Equals in Foo, or implement an IEqualityComparer<Foo> so that Except can tell when two Foo values are equal. For example:

public sealed class FooComparer : IEqualityComparer<Foo>
{
    public bool Equals(Foo x, Foo y)
    {
        return x.FooId == y.FooId && x.FooField == y.FooField;
    }

    public int GetHashCode(Foo foo)
    {
        // Note that if FooField can be null, you'll need to account for that...
        return foo.FooId ^ foo.FooField.GetHashCode();
    }
}

Then:

 foos.AddRange(processedFoos.Except(foos, new FooComparer()));
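The comment in GetHashCode above points out that a null FooField needs handling. A minimal sketch of a null-tolerant variant follows; the names NullSafeFooComparer and ComparerDemo are hypothetical and only for illustration, and the Foo class is trimmed to the three properties shown in the question.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class Foo
{
    public int FooId { get; set; }
    public string FooField { get; set; }
    public DateTime CreatedDate { get; set; }
}

// Hypothetical null-tolerant comparer; CreatedDate is deliberately ignored.
public sealed class NullSafeFooComparer : IEqualityComparer<Foo>
{
    public bool Equals(Foo x, Foo y)
    {
        if (ReferenceEquals(x, y)) return true;    // same instance, or both null
        if (x is null || y is null) return false;  // exactly one is null
        return x.FooId == y.FooId && string.Equals(x.FooField, y.FooField);
    }

    public int GetHashCode(Foo foo)
    {
        if (foo is null) return 0;
        // A null FooField contributes 0 instead of throwing.
        return foo.FooId ^ (foo.FooField?.GetHashCode() ?? 0);
    }
}

public static class ComparerDemo
{
    // Hypothetical helper: returns how many items the list holds afterwards.
    public static int Run()
    {
        var foos = new List<Foo>
        {
            new Foo { FooId = 1, FooField = "a", CreatedDate = new DateTime(2020, 1, 1) }
        };
        var processedFoos = new List<Foo>
        {
            // Same key as the existing item, different CreatedDate: skipped.
            new Foo { FooId = 1, FooField = "a", CreatedDate = new DateTime(2021, 1, 1) },
            // New key with a null FooField: added.
            new Foo { FooId = 2, FooField = null, CreatedDate = new DateTime(2021, 1, 1) }
        };

        // ToList() materializes the difference before the list is modified.
        foos.AddRange(processedFoos.Except(foos, new NullSafeFooComparer()).ToList());
        return foos.Count;
    }
}
```

With this sample data, ComparerDemo.Run() leaves two items in the list: the original one and the item with the null FooField.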
0
Charles Mager On

Both Except and Distinct compare items using the default equality comparer, which relies on the object's Equals and GetHashCode implementations. They also both have overloads that take an IEqualityComparer<T>.

You could implement this interface (see the documentation for an example) comparing only the fields you require and pass it to Except:

 foos.AddRange(processedFoos.Except(foos, comparer));
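The other route mentioned in these answers is to change what the default equality comparer sees, by overriding Equals and GetHashCode on Foo itself. The following is a rough sketch under the assumption that Foo should always be considered equal on FooId and FooField alone; EqualsDemo is a hypothetical name, and the tuple-based hash assumes C# 7 or later.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class Foo : IEquatable<Foo>
{
    public int FooId { get; set; }
    public string FooField { get; set; }
    public DateTime CreatedDate { get; set; }

    // Equality is defined by FooId and FooField only; CreatedDate is ignored.
    public bool Equals(Foo other) =>
        !(other is null) && FooId == other.FooId && FooField == other.FooField;

    public override bool Equals(object obj) => Equals(obj as Foo);

    // The hash must be built from the same fields that Equals compares.
    public override int GetHashCode() => (FooId, FooField).GetHashCode();
}

public static class EqualsDemo
{
    // Hypothetical helper: returns how many items the list holds afterwards.
    public static int Run()
    {
        var foos = new List<Foo>
        {
            new Foo { FooId = 1, FooField = "a", CreatedDate = new DateTime(2020, 1, 1) }
        };
        var processedFoos = new List<Foo>
        {
            new Foo { FooId = 1, FooField = "a", CreatedDate = new DateTime(2021, 1, 1) }, // duplicate key
            new Foo { FooId = 1, FooField = "b", CreatedDate = new DateTime(2021, 1, 1) }  // new key
        };

        // No comparer argument: the default comparer now uses the overrides above.
        foos.AddRange(processedFoos.Except(foos).ToList());
        return foos.Count;
    }
}
```

This changes equality for every use of Foo, not just this one call, so a separate IEqualityComparer<Foo> may be the safer choice when other code expects CreatedDate to matter.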
1
suvroc On

I think you should change how you select the non-duplicated records:

processedFoos.Where(x => !foos.Any(y => y.FooId == x.FooId && y.FooField == x.FooField))
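Two things are worth noting about the Where/Any filter: it rescans foos for every candidate, and unlike Except it does not remove duplicates that occur within processedFoos itself. One possible way around both, sketched here with hypothetical names (KeySetDemo, AddUnique, seenKeys), is to track the keys already added in a HashSet of tuples, assuming C# 7 or later.

```csharp
using System;
using System.Collections.Generic;

public class Foo
{
    public int FooId { get; set; }
    public string FooField { get; set; }
    public DateTime CreatedDate { get; set; }
}

public static class KeySetDemo
{
    // Hypothetical helper: appends only candidates whose (FooId, FooField)
    // key has not been seen yet, recording each new key as it goes.
    public static void AddUnique(
        List<Foo> foos,
        HashSet<(int, string)> seenKeys,
        IEnumerable<Foo> candidates)
    {
        foreach (var foo in candidates)
        {
            // HashSet.Add returns false when the key is already present.
            if (seenKeys.Add((foo.FooId, foo.FooField)))
            {
                foos.Add(foo);
            }
        }
    }

    // Hypothetical helper: feeds two batches through AddUnique and returns the count.
    public static int Run()
    {
        var foos = new List<Foo>();
        var seenKeys = new HashSet<(int, string)>();

        var firstBatch = new List<Foo>
        {
            new Foo { FooId = 1, FooField = "a", CreatedDate = new DateTime(2020, 1, 1) },
            new Foo { FooId = 1, FooField = "a", CreatedDate = new DateTime(2020, 1, 2) }, // duplicate inside the batch
            new Foo { FooId = 2, FooField = "b", CreatedDate = new DateTime(2020, 1, 3) }
        };
        var secondBatch = new List<Foo>
        {
            new Foo { FooId = 2, FooField = "b", CreatedDate = new DateTime(2020, 1, 4) }, // duplicate across batches
            new Foo { FooId = 3, FooField = "c", CreatedDate = new DateTime(2020, 1, 5) }
        };

        AddUnique(foos, seenKeys, firstBatch);
        AddUnique(foos, seenKeys, secondBatch);
        return foos.Count;
    }
}
```

In the question's loop, the set would be created once next to foos and passed to the helper for each item's processedFoos.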
1
Dennis_E On

Except and Distinct both have an overload that takes an IEqualityComparer. You can implement one to compare only the properties you need.

An alternative is to group by FooId and FooField and take the first element from each group:

processedFoos.GroupBy(foo => new { foo.FooId, foo.FooField })
    .Select(g => g.First());
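On its own, the grouping above only removes duplicates within processedFoos; it does not look at what is already in foos. A possible way to cover both, sketched here with the hypothetical names GroupByDemo and Merge, is to group the concatenation of the two lists. Because GroupBy yields groups in order of first appearance and keeps source order within each group, First() picks the item already in foos when a key exists on both sides.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class Foo
{
    public int FooId { get; set; }
    public string FooField { get; set; }
    public DateTime CreatedDate { get; set; }
}

public static class GroupByDemo
{
    // Hypothetical helper: returns a new list with one Foo per
    // (FooId, FooField) key, preferring items that were already in foos.
    public static List<Foo> Merge(List<Foo> foos, IEnumerable<Foo> processedFoos)
    {
        return foos
            .Concat(processedFoos)
            .GroupBy(foo => new { foo.FooId, foo.FooField })
            .Select(g => g.First())
            .ToList();
    }

    // Hypothetical helper: runs Merge on a small sample.
    public static List<Foo> Run()
    {
        var foos = new List<Foo>
        {
            new Foo { FooId = 1, FooField = "a", CreatedDate = new DateTime(2020, 1, 1) }
        };
        var processedFoos = new List<Foo>
        {
            new Foo { FooId = 1, FooField = "a", CreatedDate = new DateTime(2021, 1, 1) }, // key already in foos
            new Foo { FooId = 2, FooField = "b", CreatedDate = new DateTime(2021, 1, 1) },
            new Foo { FooId = 2, FooField = "b", CreatedDate = new DateTime(2022, 1, 1) }  // duplicate inside the batch
        };
        return Merge(foos, processedFoos);
    }
}
```

This rebuilds the whole list on every call, so it suits small lists better than large ones. On .NET 6 or later, DistinctBy with the same key selector expresses the same idea more directly.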