Given:
str1 = "é" # Latin accent
str2 = "囧" # Chinese character
str3 = "ジ" # Japanese character
str4 = "e" # English character
How can I differentiate str1 (an accented Latin character) from the rest of the strings?
Update:
Given
str1 = "\xE9" # Latin accented é, actually stored as \xE9 when read from a file
How would the answer be different?
I would first strip out all plain ASCII characters with gsub, and then check with a regex to see if any Latin characters remain. This should detect the accented Latin characters.
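A minimal sketch of that two-step idea follows. The helper name accented_latin? is hypothetical, and the handling of "\xE9" assumes the file was written in ISO-8859-1 (Latin-1); if the file uses another encoding, that line would need to change.

```ruby
# Hypothetical helper, not part of any library.
def accented_latin?(str)
  # A lone "\xE9" byte is not valid UTF-8. Assumption: such bytes come
  # from an ISO-8859-1 (Latin-1) file, so reinterpret and transcode them.
  if str.encoding == Encoding::BINARY || !str.valid_encoding?
    str = str.dup.force_encoding(Encoding::ISO_8859_1).encode(Encoding::UTF_8)
  end

  # Step 1: strip every plain ASCII character.
  rest = str.gsub(/[\x00-\x7F]/, "")

  # Step 2: see whether anything left belongs to the Latin script.
  rest.match?(/\p{Latin}/)
end

accented_latin?("é")    # => true
accented_latin?("\xE9") # => true (under the Latin-1 assumption)
accented_latin?("囧")   # => false
accented_latin?("ジ")   # => false
accented_latin?("e")    # => false
```

For the update case, the only difference is the transcoding step at the top: once "\xE9" has been converted to UTF-8, the same gsub and regex check apply.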