I'm transliterating user-submitted strings from UTF-8 to printable ASCII:
$str = 'Thê qúïck brõwn fõx júmps? Óvér thé lázy dõg?';
$out = iconv('UTF-8', 'ASCII//TRANSLIT', $str);
var_dump($out);
$out = 'The quick ? brown fox jumps?? Over the lazy dog??';
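I want the extra ? characters removed from $out, that is, the ones iconv substituted for characters it could not transliterate, while keeping the question marks that were already in $str.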
if ($out !== $str && strpos($out, '?') !== false) {
// The input string was modified and contains at least one question mark
//
// Not even really sure where to begin
//
// Do we need to compare the position of every character from the
// original string to every position of the new string and replace
// where the original string did not contain a question mark?
//
// That's all I can think of, but there has to be a better way.
}
I want to keep every character that //TRANSLIT converts successfully, including the few in the example $str above, e.g. áéïõú = aeiou. There is no other nuance to this question. I think it boils down to a string compare-and-replace problem.
I'm not necessarily looking for someone to write the entire code, just a pointer in the right direction of how you'd tackle this.
Here is a solution based on transliterator_transliterate():
Output:
Note that the emoji are kept by transliterator_transliterate(), so I used a regex to remove all the remaining non-ASCII characters.
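The approach described in this answer can be sketched roughly as follows. This is a minimal sketch, not necessarily the answerer's exact code: it assumes PHP's intl extension is installed, and the helper name to_printable_ascii(), the 'Any-Latin; Latin-ASCII' transliterator ID, and the sample input (with an emoji written as an escape sequence) are illustrative choices.

```php
<?php
// Sketch only: requires the intl extension. to_printable_ascii() is a
// hypothetical helper name, and the sample input below is an assumption.
function to_printable_ascii(string $str): string
{
    // 'Any-Latin; Latin-ASCII' maps accented Latin letters to plain ASCII
    // letters (for example "é" becomes "e"). Characters it cannot map, such
    // as emoji, are passed through unchanged rather than replaced with "?".
    $out = transliterator_transliterate('Any-Latin; Latin-ASCII', $str);
    if ($out === false) {
        throw new RuntimeException('Transliteration failed');
    }

    // Strip every byte outside printable ASCII (space through tilde).
    // All bytes of a multi-byte UTF-8 sequence are >= 0x80, so leftover
    // emoji are removed entirely; original "?" characters are untouched.
    return preg_replace('/[^\x20-\x7E]/', '', $out);
}

// Hypothetical input: accented text plus an emoji (U+1F47D).
$str = "Thê qúïck \u{1F47D} brõwn fõx júmps?\u{1F47D} Óvér thé lázy dõg?\u{1F47D}";
var_dump(to_printable_ascii($str));
```

Because the untransliterable characters are never turned into ?, there is no need to compare positions between the input and output strings: any question mark left in the result was in the input. Note that removing an emoji that sat between two spaces leaves both spaces behind, so you may want to collapse runs of whitespace afterwards.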