Normalize a Unicode string to get its canonical representation


Given that, for example, "à" (one Unicode character) can also be encoded as "a\u0300" (two Unicode characters, i.e. an a followed by a combining grave accent (U+0300)), is there functionality in .NET to normalize a string so that the latter is converted into the former?

I believe the former is deemed the canonical representation. My particular issue is that I've seen cases where the latter isn't displayed correctly by some browsers, but this could be useful in other scenarios too.


There is 1 answer

Clafou On

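Just found it, duh! String.Normalize. Called with no arguments it normalizes to Unicode normalization form C (NFC), which combines a base character and its combining marks into a single precomposed character where one exists.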
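
For anyone landing here later, a minimal sketch of what that looks like in practice. The class name NormalizeDemo and the helper ToComposed are made up for illustration; only String.Normalize, String.IsNormalized and NormalizationForm come from the framework. It passes NormalizationForm.FormC explicitly, which should behave the same as the parameterless overload but makes the intent visible.

```csharp
using System;
using System.Text;

public static class NormalizeDemo
{
    // Hypothetical helper (not part of .NET): returns the NFC (composed) form.
    public static string ToComposed(string s)
    {
        return s.Normalize(NormalizationForm.FormC);
    }

    public static void Main()
    {
        // 'a' followed by COMBINING GRAVE ACCENT (U+0300): two UTF-16 code units.
        string decomposed = "a\u0300";
        string composed = ToComposed(decomposed);

        Console.WriteLine(decomposed.Length);   // 2
        Console.WriteLine(composed.Length);     // 1

        // The == operator on strings is an ordinal comparison, so this only
        // holds because the composed result is the single character U+00E0.
        Console.WriteLine(composed == "\u00E0");                               // True
        Console.WriteLine(decomposed.IsNormalized(NormalizationForm.FormC));   // False
    }
}
```

If you need the opposite direction (splitting "à" into a base letter plus combining mark), NormalizationForm.FormD does that.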