How to properly sort a string-number column of a DataTable

1.6k views Asked by At

I am trying to sort a string-number column, e.g. N1, N10, N100, N2 and I am expecting the results N1, N2, N10, N100 but the sorting doesn't work, I am getting the same values N1, N10, N100, N2 in the same order.

I wrote the following code.

static class ExtensionMethod
{
    public static DataTable SortAlphaNumeric(this DataTable datatable, string columnName)
    {
        return datatable.AsEnumerable()
                  .OrderBy(r => r.Field<String>(columnName), new CustomComparer())
                  .CopyToDataTable();
    }
}

public class CustomComparer : IComparer<string>
{
    public int Compare(string x, string y)
    {
        var numberX = Regex.Match(x, @"\d+").Value;
        var numberY = Regex.Match(y, @"\d+").Value;

        var alphaX = Regex.Match(x, @"[^a-z]").Value;
        var alphaY = Regex.Match(y, @"[^a-z]").Value;

        if (alphaX.CompareTo(alphaY) == 0)
            return numberX.CompareTo(numberY);
        else if (alphaX.CompareTo(alphaY) < 0)
            return -1;
        return 1;            
    }
}

// Code example
class TestExample
{
    public void Test()
    {
        var dt = new DataTable();
        dt.Columns.Add("AlphaNumeric", Type.GetType("System.String"));
        var row = dt.NewRow();
        row["AlphaNumeric"] = "N1";
        dt.Rows.Add(row);
        row = dt.NewRow();
        row["AlphaNumeric"] = "N10";
        dt.Rows.Add(row);
        row = dt.NewRow();
        row["AlphaNumeric"] = "N100";
        dt.Rows.Add(row);
        row = dt.NewRow();
        row["AlphaNumeric"] = "N2";
        dt.Rows.Add(row);

        var orderedDt = dt.SortAlphaNumeric("AlphaNumeric");
    }
}
3

There are 3 answers

2
PinBack On BEST ANSWER

Change in your Comparer:

var numberX = int.Parse(Regex.Match(x, @"\d+").Value);
var numberY = int.Parse(Regex.Match(y, @"\d+").Value);
0
Haukinger On

I'd use (?<alpha>[A-Za-z])(?<number>\d+) as regex and compare alpha (string comparison), and then, if equal, int.Parse the number and compare that (integer comparsion).

That gives only two regex executions instead of four (probably compiling the regex and putting it in a static field will make it faster, too), and 2 will be smaller than 10 if you compare actual numbers. If you don't parse the numbers, you can skip the whole regex'ing and just do one string comparison.

1
Christopher On

If the number sheme is determinsitic (always a string+ a integer) you could simply store them in two seperate fields (ideally back in the DB. Composed Keys are a thing after all). Then it is "sort by string first, number 2nd".

If it is not, stuff is going to get tough. What you want is the quite unusual Sorting that Windows does with files. Aside from custom regex solutions (but again if it is that determinstic you can propably use 2 or more fields), there would be StrCmpLogicalW. But that one is both unmanaged and it's behavior varries between Windows (it is the "canonical" sorting for the Windows it runs on, but that sorting varried between Windows Versions).