MariaDB: Insert fails due to random Unicode codepoint

73 views Asked by At

I’m using MariaDB 10.6.16, but reproduced the same behaviour with 10.3.39, too. My problem is this little SQL snippet:

CREATE TABLE problem (
    text   TEXT
) CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci;

INSERT INTO problem (text) VALUES ('');

This results in the error message:

ERROR 1366 (22007) at line 5: Incorrect string value: '\xF0\xA4\x8B\xAE' for column `problem`.`text` at row 1

The bytes \xF0\xA4\x8B\xAE represent the character exactly, so the problem is that MariaDB doesn’t allow inserting this specific character.

What I’ve tried:

  • checking the character. It’s U+242EE, an ancient Unicode character defined in 2001. So it shouldn’t be a problem with too new a code point for the collation.
  • checking two MariaDB versions, as noted above.
  • removing the collation and charset from the table definition. No change.
  • using other 4 byte characters in the SQL query. They work just fine, it’s only this (and possibly some other random CJK characters, too).

How can I convince MariaDB to insert this character into my DB?

1

There are 1 answers

0
danblack On BEST ANSWER

Getting the consistent character set for CHARACTER_SET_RESULTS, CHARACTER_SET_CONNECTION and CHARACTER_SET_CLIENT is required to use non-default character sets.

To do this all in one statement, use SET NAMES, in this case:

SET NAMES utf8mb4

Will set all these client related system variables to a character set that supports the '' character.

The collation is automaticity set to the default for the character set, though it can be explicit with set names too.