Convert a string to ordinal upper or lower case


Is it possible to convert a string to ordinal upper or lower case, similar to the invariant conversions?

string upperInvariant = "ß".ToUpperInvariant();
string lowerInvariant = "ß".ToLowerInvariant();
bool invariant = upperInvariant == lowerInvariant; // true

string upperOrdinal = "ß".ToUpperOrdinal(); // SS
string lowerOrdinal = "ß".ToLowerOrdinal(); // ss
bool ordinal = upperOrdinal == lowerOrdinal; // false

How can ToUpperOrdinal and ToLowerOrdinal be implemented?

Edit: How can I get the ordinal string representation? Likewise, how can I get the invariant string representation? Maybe that's not possible, since in the above case it might be ambiguous, at least for the ordinal representation.

Edit2:

string.Equals("ß", "ss", StringComparison.InvariantCultureIgnoreCase); // true

but

"ß".ToLowerInvariant() == "ss"; // false

There are 2 answers

2
NightOwl888

I don't believe this functionality exists in the .NET Framework or .NET Core. The closest thing is string.Normalize(), but it is missing the case fold option that you need to successfully pull this off.
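To illustrate why string.Normalize() alone does not help here, the following is a minimal sketch. The class name NormalizeDemo and the helper AllForms are hypothetical names used only for this illustration; the only framework calls are string.Normalize and the NormalizationForm enum.

```csharp
using System;
using System.Text;

public static class NormalizeDemo
{
    // Hypothetical helper: returns the input in the four Unicode
    // normalization forms, in the order C, D, KC, KD.
    public static string[] AllForms(string s) => new[]
    {
        s.Normalize(NormalizationForm.FormC),
        s.Normalize(NormalizationForm.FormD),
        s.Normalize(NormalizationForm.FormKC),
        s.Normalize(NormalizationForm.FormKD),
    };
}
```

Since "ß" (U+00DF) has no canonical or compatibility decomposition, every element of NormalizeDemo.AllForms("ß") should still be "ß". None of the four forms applies case folding, so none of them yields "ss".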

This functionality exists in the ICU project (which is available in C/Java). What you are after is declared in the unorm2.h header in C or provided by the Normalizer2 class in Java. Example usage in Java and related test.

There are 2 implementations of Normalizer2 that I am aware of that have been ported to C#:

  • icu-dotnet (a C# wrapper library for ICU4C)
  • ICU4N (a fully managed port of ICU4J)

Full Disclosure: I am a maintainer of ICU4N.

0
James Crosswell

From MSDN:

The StringComparer returned by the OrdinalIgnoreCase property treats the characters in the strings to compare as if they were converted to uppercase using the conventions of the invariant culture, and then performs a simple byte comparison that is independent of language.

But I'm guessing that doing this won't achieve what you want, since simply calling "ß".ToUpperInvariant() won't give you a string that is ordinally equivalent to "ss". There must be some magic in the String.Equals method that handles the special case described in Why “ss” equals 'ß'.
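The difference between the two kinds of comparison can be checked with a small sketch. ComparisonDemo and its two methods are hypothetical wrappers written for this illustration; they only call string.Equals with the standard StringComparison values.

```csharp
using System;

public static class ComparisonDemo
{
    // Ordinal, case-insensitive: compares characters one by one after
    // upper-casing them; it never expands one character into two.
    public static bool OrdinalIgnoreCaseEquals(string a, string b) =>
        string.Equals(a, b, StringComparison.OrdinalIgnoreCase);

    // Linguistic, case-insensitive comparison using the invariant culture.
    public static bool InvariantIgnoreCaseEquals(string a, string b) =>
        string.Equals(a, b, StringComparison.InvariantCultureIgnoreCase);
}
```

ComparisonDemo.OrdinalIgnoreCaseEquals("ß", "ss") returns false, because the strings differ in length and no expansion takes place. The question reports true for the invariant-culture variant; that result comes from the platform's linguistic comparison rules, so it may depend on the runtime and its globalization library.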

If you're only worried about German text then this answer might help.
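For the German-only case, one possible sketch (not the linked answer, and not a framework API) is to expand "ß" by hand around the invariant case conversions. The method names ToUpperOrdinal and ToLowerOrdinal are taken from the question; the class name GermanCaseExtensions is hypothetical, and the code assumes that "ß" is the only character that needs a multi-character mapping.

```csharp
using System;

public static class GermanCaseExtensions
{
    // Assumption: only U+00DF (ß) needs expanding. Full Unicode case
    // folding covers many more characters and would need ICU or similar.
    public static string ToUpperOrdinal(this string s) =>
        // string.Replace(string, string) matches ordinally.
        s.Replace("ß", "SS").ToUpperInvariant();

    public static string ToLowerOrdinal(this string s) =>
        // Lower-case first, then expand any ß that remains.
        s.ToLowerInvariant().Replace("ß", "ss");
}
```

With these extensions, "ß".ToUpperOrdinal() gives "SS" and "ß".ToLowerOrdinal() gives "ss", matching the behaviour the question asks for. The expansion is lossy: the original "ß" cannot be recovered from the result.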