Removal of Duplicate Rows from Data table Based on Multiple columns

Question

9.5k views Asked by Pradeep At 11 November 2014 at 08:29

I have a DataTable that contains many duplicate rows. I need to filter it on multiple columns so that the resulting DataTable holds only distinct rows.

Barcode  ItemID  PackTypeID
1        100     1
1        100     2
1        100     3
1        100     1
1        100     3

I need only one row for each PackTypeID (1, 2 and 3); the 4th and 5th rows are duplicates and should be removed.

I have tried two methods, but neither gives the result I need.

The DataTable contains more than 10 columns, but the columns that define uniqueness are "Barcode", "ItemID" and "PackTypeID".

Method-1:

 dt_Barcode = dt_Barcode.DefaultView.ToTable(true, "Barcode", "ItemID", "PackTypeID");

The above method removes the duplicate rows, but the result contains only those 3 columns; I need all 10 columns.

Method-2:

    List<string> keyColumns = new List<string>();
    keyColumns.Add("Barcode");
    keyColumns.Add("ItemID");
    keyColumns.Add("PackTypeID");

    void RemoveDuplicates(DataTable table, List<string> keyColumns)
    {
        var uniqueness = new HashSet<string>();
        StringBuilder sb = new StringBuilder();
        int rowIndex = 0;
        DataRow row;
        DataRowCollection rows = table.Rows;
        int i = rows.Count;
        while (rowIndex < i)
        {
            row = rows[rowIndex];
            sb.Length = 0;
            foreach (string colname in keyColumns)
            {
                sb.Append(row[colname]);
                sb.Append("|");
            }

            if (uniqueness.Contains(sb.ToString()))
            {
                rows.Remove(row);
            }
            else
            {
                uniqueness.Add(sb.ToString());
                rowIndex++;
            }
        }
    }

The above method throws an exception like "There is no row at position 5".

There are 3 answers

galenus On 11 November 2014 at 09:47

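This happens because rows are removed inside the loop while the bound i still holds the original row count, so the index eventually runs past the end of the shrunken collection.

If you want to keep the same algorithm, replace while (rowIndex < i) with a loop that runs backward:

for (var rowIndex = rows.Count - 1; rowIndex >= 0; rowIndex--)
{
    ...

    if (uniqueness.Contains(sb.ToString()))
    {
        // The for statement already decrements rowIndex;
        // decrementing again here would skip a row.
        rows.Remove(row);
    }
    ...
}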
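
For reference, here is a minimal self-contained sketch of the backward loop described above, with the elided parts filled in. The class name DataTableDedup and the use of HashSet.Add and RemoveAt are choices made for this illustration, not part of the original answer. Because the loop starts at the end, it keeps the last occurrence of each key rather than the first.

```csharp
using System;
using System.Collections.Generic;
using System.Data;
using System.Text;

// Hypothetical helper class, named for this sketch only.
public static class DataTableDedup
{
    // Removes rows whose key-column values were already seen.
    // Walking from the last index down means a removal never shifts
    // the position of a row that has not been visited yet.
    public static void RemoveDuplicates(DataTable table, IList<string> keyColumns)
    {
        var seen = new HashSet<string>();
        var sb = new StringBuilder();
        DataRowCollection rows = table.Rows;

        for (int rowIndex = rows.Count - 1; rowIndex >= 0; rowIndex--)
        {
            DataRow row = rows[rowIndex];

            // Build a composite key such as "1|100|3|" from the key columns.
            sb.Length = 0;
            foreach (string colName in keyColumns)
            {
                sb.Append(row[colName]).Append('|');
            }

            // HashSet.Add returns false when the key is already present.
            if (!seen.Add(sb.ToString()))
            {
                rows.RemoveAt(rowIndex);
            }
        }
    }
}
```

All columns of the surviving rows are left untouched, since rows are removed in place rather than projected into a new table.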

subho On 05 August 2020 at 09:11

public void RemoveDuplicatesFromDataTable(ref DataTable table, List<string> keyColumns)
{
    // Keys already seen; the value is unused, so the dictionary acts as a set.
    Dictionary<string, string> uniquenessDict = new Dictionary<string, string>(table.Rows.Count);
    StringBuilder stringBuilder = new StringBuilder();
    int rowIndex = 0;
    DataRowCollection rows = table.Rows;
    string error = string.Empty;

    try
    {
        // rows.Count is re-read on every pass because rows are removed inside the loop.
        while (rowIndex < rows.Count)
        {
            DataRow row = rows[rowIndex];
            stringBuilder.Length = 0;

            foreach (string colname in keyColumns)
            {
                // ToString() works for any column type; a (string) cast
                // throws for numeric columns such as ItemID.
                string value = row[colname].ToString();
                if (value == string.Empty)
                {
                    // If it comes here, one of the key values is blank.
                    error += "One of the key values is blank.";
                }
                stringBuilder.Append(value);
                // The separator keeps ("1", "23") distinct from ("12", "3").
                stringBuilder.Append("|");
            }

            // Use the same key string for both the lookup and the insert.
            string key = stringBuilder.ToString();
            if (uniquenessDict.ContainsKey(key))
            {
                rows.Remove(row);
            }
            else
            {
                uniquenessDict.Add(key, string.Empty);
                rowIndex++;
            }
        }
    }
    catch (Exception ex)
    {
        error = "Failed - " + ex.Message;
    }

    // Show is not defined in this answer; presumably the author's own helper.
    if (error != string.Empty)
        Show(error);
}

**Pradeep** · Accepted Answer · 2014-11-17T05:55:46+00:00

Method 3:

Instead of the two methods above, I found this LINQ approach, which worked well:

     dt_Barcode = dt_Barcode.AsEnumerable()
         .GroupBy(r => new { ItemID = r.Field<Int64>("ItemID"), PacktypeId = r.Field<Int32>("PackTypeID") })
         .Select(g => g.First())
         .CopyToDataTable();
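
The snippet above groups on ItemID and PackTypeID only. If Barcode can also vary, it would need to be part of the key. The following is a hedged sketch of that variant; the class and method names (DataTableDistinct, DistinctByKeys) are hypothetical, and the key values are read through the untyped row indexer so that no column types have to be assumed. It also guards against an empty table, since CopyToDataTable throws when the source sequence has no rows.

```csharp
using System.Data;
using System.Linq;

// Hypothetical helper class, named for this sketch only.
public static class DataTableDistinct
{
    // Returns a new table holding the first row of each
    // (Barcode, ItemID, PackTypeID) combination, with all columns intact.
    public static DataTable DistinctByKeys(DataTable source)
    {
        // CopyToDataTable throws on an empty sequence,
        // so return an empty table with the same schema instead.
        if (source.Rows.Count == 0)
        {
            return source.Clone();
        }

        return source.AsEnumerable()
            .GroupBy(r => new
            {
                // Boxed values compare by value, so the anonymous type
                // works as a composite key without knowing column types.
                Barcode = r["Barcode"],
                ItemID = r["ItemID"],
                PackTypeID = r["PackTypeID"]
            })
            .Select(g => g.First())
            .CopyToDataTable();
    }
}
```

Unlike a backward removal loop, GroupBy followed by First keeps the first occurrence of each key and preserves the original row order. On .NET Framework, AsEnumerable and CopyToDataTable require a reference to System.Data.DataSetExtensions.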
