I'm trying to normalize strings with characters like 'áéíóú' to 'aeiou' to simplify searches.
Following the answer to this question, I should use the Normalizer class to do it.
The problem is that the normalize function appears to do nothing. For example, this code:
<?php echo 'Pérez, NFC: ' . normalizer_normalize('Pérez', Normalizer::NFC)
. ' NFD: ' .normalizer_normalize('Pérez', Normalizer::NFD)
. ' NFKC: ' .normalizer_normalize('Pérez', Normalizer::NFKC)
. ' NFKD: ' .normalizer_normalize('Pérez', Normalizer::NFKD)?>
<br/>
<?php echo 'aáàä, êëéè,'
. ' FORM_C: ' . normalizer_normalize('aáàä, êëéè', Normalizer::FORM_C )
. ' FORM_D: ' .normalizer_normalize('aáàä, êëéè', Normalizer::FORM_D)
. ' FORM_KC: ' .normalizer_normalize('aáàä, êëéè', Normalizer::FORM_KC)
. ' FORM_KD: ' .normalizer_normalize('aáàä, êëéè', Normalizer::FORM_KD)?>
shows:
Pérez, NFC: Pérez NFD: Pérez NFKC: Pérez NFKD: Pérez
aáàä, êëéè, FORM_C: aáàä, êëéè FORM_D: aáàä, êëéè FORM_KC: aáàä, êëéè FORM_KD: aáàä, êëéè
What is normalize supposed to do?
---EDITED---
It gets stranger. When I copy and paste the result from the web browser, in the editor and on the original page I see:
FORM_D: aáàä, êëéè
but on the Stack Overflow question page I see (only in Code Sample mode):
FORM_D: aáàä, êëéè
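A quick way to check that the forms really differ, even though they render the same in a browser, is to compare byte lengths and hex dumps. This is only a minimal sketch, assuming PHP 7+ with the intl extension and UTF-8 strings; the variable names are illustrative.

```php
<?php
// Assumes the intl extension is loaded (it provides normalizer_normalize).
$input = "P\u{00E9}rez"; // "Pérez" with a precomposed é (U+00E9)

$nfc = normalizer_normalize($input, Normalizer::FORM_C);
$nfd = normalizer_normalize($input, Normalizer::FORM_D);

// The two forms look identical on screen but are different byte sequences.
var_dump(strlen($nfc));   // int(6): é is stored as the two bytes C3 A9
var_dump(strlen($nfd));   // int(7): e followed by U+0301 (bytes CC 81)
var_dump(bin2hex($nfd));  // string(14) "5065cc8172657a"
var_dump($nfc === $nfd);  // bool(false)
```

So FORM_D does change the string: it splits each accented letter into a base letter plus a combining mark, which most renderers draw as one glyph.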
Found on this page (the linked document now has different wording; the old one no longer exists):
So eliminating accents (and similar) is not the purpose of Normalizer.
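That said, decomposition can be a first step toward the original goal. The following is a hedged sketch, not something Normalizer does on its own: it decomposes with FORM_D and then removes the combining marks with a Unicode-aware regular expression. It assumes PHP 7+ with the intl extension and UTF-8 input; strip_accents is a hypothetical helper name, not a built-in function.

```php
<?php
// Hypothetical helper: strip_accents is not part of PHP or intl.
function strip_accents(string $text): string
{
    // Decompose: "é" becomes "e" followed by U+0301 (combining acute accent).
    $decomposed = normalizer_normalize($text, Normalizer::FORM_D);
    if ($decomposed === false) {
        return $text; // normalization failed (e.g. invalid UTF-8); leave input as-is
    }

    // Remove every non-spacing mark (Unicode category Mn), keeping base letters.
    $stripped = preg_replace('/\p{Mn}+/u', '', $decomposed);

    return $stripped ?? $text; // preg_replace returns null on error
}

echo strip_accents('áéíóú Pérez'); // prints "aeiou Perez"
```

Letters that have no canonical decomposition (for example 'ø' or 'ß') are left unchanged by this approach, so it may not cover every character a search needs to fold.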