How can we save emojis in SQL database to preserve their uniqueness for equality comparisons?

648 views Asked by At

We have an issue when we save emojis to our database.

We have altered our table column to a character set and collation that should save enough bytes to distinguish between emojis.

This was our query to modify the table column:

ALTER TABLE TableName MODIFY TableColumn VARCHAR(2048) CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci;

When we save the following emojis, the binary values are as follows:

 ♥️  | e2 99 a5 ef b8 8f
 | f0 9f 90 92
 | f0 9f 90 b5

Some emojis (♥️) save with binary information that allows us to distinguish them as unique/separate emojis.

Since we do not want to save the same emoji twice, for our use case, we are checking for the 'value' in our database. If the value exists, we do not save the emoji.

However, many emoji values, including and , are treated as the same value.

Therefore, when we attempt to save and , only one gets saved...

Why are they treated as the same value?

How can we save and as identifiably different emojis, using SQL?

Is there a better char set or collation we could use?

We are accessing a MySql and MSSql databases via separate repositories, written in C# in a .NET framework project.

1

There are 1 answers

1
FrankPl On

Actually, your code for the sitting monkey is wrong! That would be f0 9f 90 92, not f0 9f 90 b5. See https://www.fileformat.info/info/unicode/char/1f412/index.htm, the UTF-8 hex code. The other two codes seem correct.

Whatever you did, the error is not on the side of the code you showed.