Why does RLIKE match á and è in binary charset in MariaDB 10.2

35 views Asked by At

DB: MariaDB 10.2.13, 10.3.39.

Operator RLIKE return true when checking if in my binary string á exist char è:

convert('á' using binary) RLIKE '[è]': 1

Query:

([email protected]:3306) [test]> select char(50081), char(50088), hex(char(50081)), hex('á'), char(50081) RLIKE '[è]', char(50088) RLIKE '[á]', convert(char(50088) using utf8) RLIKE '[á]', convert(char(50088) using utf8mb4) RLIKE '[á]', convert('á' using binary) RLIKE '[è]', 'á' RLIKE '[è]' \G
*************************** 1. row ***************************
                                    char(50081): á
                                    char(50088): è
                               hex(char(50081)): C3A1
                                      hex('á'): C3A1
                       char(50081) RLIKE '[è]': 1
                       char(50088) RLIKE '[á]': 1
   convert(char(50088) using utf8) RLIKE '[á]': 0
convert(char(50088) using utf8mb4) RLIKE '[á]': 0
        convert('á' using binary) RLIKE '[è]': 1
                              'á' RLIKE '[è]': 0
1 row in set (0,00 sec)
1

There are 1 answers

0
Paul Spiegel On BEST ANSWER

Look at the HEX codes of 'á' and 'è', which are 'C3A1' and 'C3A8'.

select hex('á'), hex('è');

In binary context values are compared byte by byte. Which makes your query equivalent to

select concat(0xC3, 0xA1) RLIKE concat('[', 0xC3, 0xA8, ']');

The byte 'C3' is contained in 'C3A1', thus the match succeeds.