Transliterator transliterate Rules


I use this function to transliterate Cyrillic words into Latin:

$string = transliterator_transliterate('Any-Latin; NFD; [:Nonspacing Mark:] Remove; NFC;', $name);

However, I get single-letter replacements instead of multi-letter ones. That is, after processing the word "Финиш" I get "Finis", but it should be "Finish".

For example (the result expected by the standard is in parentheses):

ш -> s (sh)
щ -> s (shch)
ч -> c (ch)
... and others

Here is the full correct table:

а-a б-b в-v г-g д-d е-e ё-e ж-zh з-z и-i й-i к-k л-l м-m н-n о-o п-p р-r
с-s т-t у-u ф-f х-kh ц-ts ч-ch ш-sh щ-shch ы-y ъ-ie э-e ю-iu я-ia 

As I understand it, this has to be configured somewhere in the rules, but I can't figure out from the documentation how to do it.

Or perhaps there is some other option?

1 Answer

Answer by Casimir et Hippolyte

All you have to do is write rules for the particular cases:

$str = 'а-a б-b в-v г-g д-d е-e ё-e ж-zh з-z и-i й-i к-k л-l м-m н-n о-o п-p р-r
с-s т-t у-u ф-f х-kh ц-ts ч-ch ш-sh щ-shch ы-y ъ-ie э-e ю-iu я-ia    Финиш';

$rules = <<<'RULES'
:: NFC ;
ё > e; ж > zh; й > i; х > kh; ц > ts; ч > ch; ш > sh; щ > shch; ъ > ie;
э > e; ю > iu; я > ia;
:: Cyrillic-Latin ;
RULES;

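$tls = Transliterator::createFromRules($rules);
if ($tls === null) {
    // createFromRules() returns null when the rules cannot be parsed.
    exit(intl_get_error_message() . PHP_EOL);
}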

echo $tls->transliterate($str), PHP_EOL;
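
The rules above only list lowercase letters, so a capital Ш or Щ would fall through to the general Cyrillic-Latin pass. Below is a minimal sketch that also covers capitals. It assumes title-case output such as "Shch" is acceptable. The capitalised rules and the sample words are assumptions added for illustration, not part of the original table.

```php
<?php
// Sketch only: requires the intl extension.
// The capitalised rules are an illustrative assumption; adjust them
// (e.g. to "SHCH") if all-caps words need different output.
$rules = <<<'RULES'
:: NFC ;
Ё > E; Ж > Zh; Й > I; Х > Kh; Ц > Ts; Ч > Ch; Ш > Sh; Щ > Shch; Ъ > Ie;
Э > E; Ю > Iu; Я > Ia;
ё > e; ж > zh; й > i; х > kh; ц > ts; ч > ch; ш > sh; щ > shch; ъ > ie;
э > e; ю > iu; я > ia;
:: Cyrillic-Latin ;
RULES;

$tls = Transliterator::createFromRules($rules);
if ($tls === null) {
    // Rule parsing failed; surface the intl error message.
    throw new RuntimeException(intl_get_error_message());
}

// Letters without a special rule are handled by Cyrillic-Latin.
$result = $tls->transliterate('Щука Шина Финиш');
echo $result, PHP_EOL;
```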

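Note that the rules for the particular cases have to come before the general rule (Cyrillic-Latin). Otherwise the general rule converts those letters first, and the particular rules no longer find anything to match.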
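
To see the effect of the ordering, here is a small sketch that builds the same two special rules once before and once after the general pass. The variable names are hypothetical, and only two letters are covered for brevity.

```php
<?php
// Sketch only: requires the intl extension.
// Special rules first, then the general Cyrillic-Latin pass.
$rulesSpecialFirst = "ш > sh; ч > ch;\n:: Cyrillic-Latin ;";
// General pass first: ш is already Latin by the time the special rules run.
$rulesGeneralFirst = ":: Cyrillic-Latin ;\nш > sh; ч > ch;";

$specialFirst = Transliterator::createFromRules($rulesSpecialFirst);
$generalFirst = Transliterator::createFromRules($rulesGeneralFirst);
if ($specialFirst === null || $generalFirst === null) {
    throw new RuntimeException(intl_get_error_message());
}

$good = $specialFirst->transliterate('Финиш');
$bad  = $generalFirst->transliterate('Финиш');

echo $good, PHP_EOL; // the special rule for ш applies
echo $bad, PHP_EOL;  // the special rule never matches, so no "sh" here
```

In the second transliterator the Cyrillic ш no longer exists when the `ш > sh` rule is evaluated, so the output keeps whatever Cyrillic-Latin produced for it.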