I need to replace diacritic characters (e.g. ä, ó, etc.) with their 'base' character. For most of the characters, this solution works:
StringUtils.stripAccents(tmpStr);
but this misses four characters: æ, œ, ø, and ß.
I looked at the answers to "Is there a way to get rid of accents and convert a whole string to regular letters?" and expected the first solution to work, but it does not.
How can I replace these characters with their 'base' character (e.g. replace æ with a)?
The StringUtils source code (https://commons.apache.org/proper/commons-lang/apidocs/src-html/org/apache/commons/lang3/StringUtils.html) has a comment that says:
// Note that this doesn't correctly remove ligatures...
So you may need to replace those characters manually.
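A minimal sketch of such a manual replacement follows. These four characters have no canonical Unicode decomposition, which is why decomposition-based stripping leaves them untouched. The class name LigatureFolder and the method foldToAscii are hypothetical, and the replacement strings ("ae", "oe", "o", "ss") are an assumption; use "a", "o" and "s" instead if you want exactly one base character each. To keep the sketch free of dependencies, it strips the remaining accents with java.text.Normalizer rather than StringUtils.stripAccents.

```java
import java.text.Normalizer;

public class LigatureFolder {

    // Hypothetical helper, not part of Commons Lang.
    static String foldToAscii(String input) {
        // Handle the characters that have no canonical decomposition first.
        // The replacement strings are an assumption; adjust them as needed.
        String replaced = input
                .replace("\u00E6", "ae")  // æ
                .replace("\u00C6", "AE")  // Æ
                .replace("\u0153", "oe")  // œ
                .replace("\u0152", "OE")  // Œ
                .replace("\u00F8", "o")   // ø
                .replace("\u00D8", "O")   // Ø
                .replace("\u00DF", "ss"); // ß
        // Decompose the remaining accented characters (NFD), then drop
        // the combining marks that the decomposition produced.
        String decomposed = Normalizer.normalize(replaced, Normalizer.Form.NFD);
        return decomposed.replaceAll("\\p{InCombiningDiacriticalMarks}+", "");
    }

    public static void main(String[] args) {
        // "Straße café" written with Unicode escapes
        System.out.println(foldToAscii("Stra\u00DFe caf\u00E9")); // prints "Strasse cafe"
    }
}
```

If you already depend on Commons Lang, the last two statements of foldToAscii could be swapped for a call to StringUtils.stripAccents(replaced).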
For a reference table, see "Diacritical Character to ASCII Character Mapping": https://docs.oracle.com/cd/E29584_01/webhelp/mdex_basicDev/src/rbdv_chars_mapping.html