I have a varchar column and I want to replace all diacritics with normal letters.
For example:
- In: São Paulo  Out: Sao Paulo
- In: eéíãç  Out: eeiac
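A character with a diacritic is a composite character, i.e. it can be represented as a base character plus a combining diacritical mark, e.g. 'a' plus a combining acute accent gives 'á'.
Both the sequence U+0061 U+0301 and the single code point U+00E1 result in the same character. Unicode allows switching back and forth between the two forms using normalization functions, which are supported by Teradata: translate(... using UNICODE_TO_UNICODE_NFD) decomposes a composite character into separate characters. Those Combining Diacritical Marks are in a Unicode block ranging from U+0300 to U+036F.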
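To see what decomposition does to a single character, a small probe query can print the code points before and after normalization. This is only a sketch: it assumes the hexadecimal character literal syntax (`_UNICODE '...'XC`) and `CHAR2HEXINT` are available in your Teradata release, and the column aliases are made up for illustration.

```sql
-- Compare the composed form of 'á' (U+00E1) with its NFD decomposition.
-- Assumption: _UNICODE '00E1'XC is accepted as a Unicode hex literal.
SELECT
    CHAR2HEXINT(_UNICODE '00E1'XC) AS composed_hex,      -- the single code point
    CHAR2HEXINT(
        TRANSLATE(_UNICODE '00E1'XC USING UNICODE_TO_UNICODE_NFD)
    ) AS decomposed_hex;                                 -- base letter plus combining mark
```

If normalization behaves as described, the second column should show the base letter followed by a code point from the U+0300 to U+036F block.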
Now decompose the input and apply a regular expression to remove characters from this range. What remains is the input with its diacritics stripped.
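The decompose-then-strip step described above might look roughly like the query below. It is a sketch under assumptions, not the original answer's code: `cities` and `city_name` are hypothetical names, the column is assumed to be `CHARACTER SET UNICODE`, and the `\x{...}` code point escape is assumed to be accepted by Teradata's regular expression engine.

```sql
-- Hypothetical table and column: cities.city_name (VARCHAR, CHARACTER SET UNICODE).
SELECT
    city_name,
    REGEXP_REPLACE(
        TRANSLATE(city_name USING UNICODE_TO_UNICODE_NFD),  -- decompose: 'ã' becomes 'a' + U+0303
        '[\x{0300}-\x{036F}]',                              -- any combining diacritical mark
        ''                                                  -- replace it with nothing
    ) AS city_name_plain
FROM cities;
```

The goal is to turn values such as São Paulo into Sao Paulo. Letters that have no canonical decomposition (for example ø or ß) pass through unchanged, so they would still need separate handling.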
If there are other decomposable characters, you might need to compose them again to save space using another translate(... using UNICODE_TO_UNICODE_NFC).
If your input string has a LATIN charset, it might be easier to find the limited list of diacritical characters and translate each one to its base letter directly.
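The two suggestions above could be sketched as follows. Both queries are illustrations under assumptions: `cities`, `city_name` and `city_name_plain` are hypothetical names, and using `OTRANSLATE` for the character mapping is my choice of function, not something the text confirms. The character list is deliberately short and covers only some lowercase letters.

```sql
-- 1) Strip combining marks, then recompose whatever decomposed sequences remain (NFC).
SELECT
    TRANSLATE(
        REGEXP_REPLACE(
            TRANSLATE(city_name USING UNICODE_TO_UNICODE_NFD),
            '[\x{0300}-\x{036F}]',
            ''
        )
        USING UNICODE_TO_UNICODE_NFC
    ) AS city_name_plain
FROM cities;

-- 2) LATIN alternative: map each accented letter to its base letter by position.
--    The two lists must line up character for character (24 characters each here).
SELECT
    OTRANSLATE(city_name,
               'áàâãäéèêëíìîïóòôõöúùûüçñ',
               'aaaaaeeeeiiiiooooouuuucn') AS city_name_plain
FROM cities;
```

The second form avoids normalization and regular expressions entirely, at the cost of maintaining the list by hand; uppercase letters and any further accented characters in your data would have to be added to both strings.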