In the Unicode Locale Data Markup Language (LDML), the <rules> element and its sub-elements have been deprecated since version 24, but the MySQL example still uses the deprecated <rules> element.
When I added a collation to MySQL using the latest CLDR collation definition, which is marked up with the <cr> element, the collation did not take effect.
I want to add a MySQL collation for UTF-8 characters using the stroke collation from zh.xml.
MySQL Path: mysql-8.0.28-winx64\share\charsets\index.xml
MySQL version: 8.0.28
Stroke collation in: https://github.com/unicode-org/cldr/blob/main/common/collation/zh.xml
http://www.unicode.org/reports/tr35/#Element_rules
https://dev.mysql.com/doc/mysql-g11n-excerpt/8.0/en/ldml-rules.html
How to repeat
Step 1. Edit mysql-8.0.28-winx64\share\charsets\index.xml and add a <collation> element (collation content copied from the CLDR collation file zh.xml), like:
<charset name="utf8mb4">
<family>Unicode</family>
<description>UTF-8 Unicode</description>
<collation name="utf8mb4_stroke_ci" id="1030" type='stroke'>
<cr><![CDATA[
[import zh-u-co-private-pinyin]
...more data...
]]></cr>
</collation>
</charset>
Step 2. Restart the MySQL server
Step 3. Check that the collation was added successfully
mysql> SHOW COLLATION WHERE Collation = 'utf8mb4_stroke_ci';
+-------------------+---------+------+---------+----------+---------+---------------+
| Collation         | Charset | Id   | Default | Compiled | Sortlen | Pad_attribute |
+-------------------+---------+------+---------+----------+---------+---------------+
| utf8mb4_stroke_ci | utf8    | 1030 |         |          |       8 | PAD SPACE     |
+-------------------+---------+------+---------+----------+---------+---------------+
1 row in set (0.00 sec)
Step 4. Create a database and a table, then insert some data
mysql> create database collation_test;
Query OK, 1 row affected (0.02 sec)
mysql> use collation_test;
Database changed
mysql> SET NAMES utf8mb4 COLLATE utf8mb4_stroke_ci;
Query OK, 0 rows affected (0.00 sec)
mysql> CREATE TABLE member_stroke (
-> name VARCHAR(64) CHARACTER SET utf8mb4 COLLATE utf8mb4_stroke_ci
-> );
Query OK, 0 rows affected (0.05 sec)
mysql> insert into member_stroke values('一'); -- character '一' means '1', stroke 1.
Query OK, 1 row affected (0.01 sec)
mysql> insert into member_stroke values('二'); -- character '二' means '2', stroke 2.
Query OK, 1 row affected (0.01 sec)
mysql> insert into member_stroke values('三'); -- character '三' means '3', stroke 3.
Query OK, 1 row affected (0.01 sec)
Step 5. Select the data and order by name
mysql> select * from member_stroke order by name;
+------+
| name |
+------+
| 一 |
| 三 |
| 二 |
+------+
3 rows in set (0.00 sec)
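A quick check outside MySQL shows that the order returned above is the same as plain Unicode code point order for these three characters. This suggests, without proving it, that the <cr> tailoring was ignored and the server fell back to an untailored ordering. The sketch below is only an illustration and does not reproduce MySQL's collation logic:

```python
# Sort the three test characters by Unicode code point.
# Python compares str values by code point, so sorted() is enough here.
chars = ["一", "二", "三"]
by_code_point = sorted(chars)

for ch in by_code_point:
    # Print each character with its code point in U+XXXX form.
    print(ch, f"U+{ord(ch):04X}")
```

The code points are U+4E00 (一), U+4E09 (三) and U+4E8C (二), so code point order puts 三 before 二, which is what the query returned.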
Expected result
+------+
| name |
+------+
| 一 |
| 二 |
| 三 |
+------+
Additional information
When I use the <rules> element to define the collation, it succeeds! But it has been deprecated in LDML since version 24 (2013-09-18).
<charset name="utf8mb4">
<family>Unicode</family>
<description>UTF-8 Unicode</description>
<collation name="utf8mb4_stroke_ci" id="1030" type='stroke' alt='short'>
<rules>
<!-- START AUTOGENERATED STROKE SHORT -->
<reset><last_non_ignorable /></reset>
<p>⠁</p><!-- INDEX 1 -->
<pc>一</pc><!-- 1 -->
<p>⠁</p><!-- INDEX 2 -->
<pc>二</pc><!-- 2 -->
<p>⠁</p><!-- INDEX 3 -->
<pc>三</pc><!-- 3 -->
</rules>
</collation>
</charset>
mysql> select * from member_stroke order by name;
+------+
| name |
+------+
| 一 |
| 二 |
| 三 |
+------+
3 rows in set (0.00 sec)
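As a stopgap while <cr> is not honored, a <rules> block in the old syntax could be generated from a list of characters grouped by stroke count. The following is a minimal sketch under assumptions: `build_rules` is a hypothetical helper, the stroke groups are supplied by hand rather than parsed from zh.xml, and the `<p>⠁</p>` index markers from the example above are left out. The output has not been tested against a server.

```python
from xml.sax.saxutils import escape


def build_rules(stroke_groups):
    """Build an old-style <rules> block (hypothetical helper).

    stroke_groups maps a stroke count to a string of characters that
    should sort in that order within the group.
    """
    lines = ["<rules>", "  <reset><last_non_ignorable /></reset>"]
    for strokes in sorted(stroke_groups):
        # <pc> gives each character a primary difference from the previous one.
        chars = escape(stroke_groups[strokes])
        lines.append(f"  <pc>{chars}</pc><!-- {strokes} -->")
    lines.append("</rules>")
    return "\n".join(lines)


if __name__ == "__main__":
    print(build_rules({1: "一", 2: "二", 3: "三"}))
```

The generated text would be pasted inside the <collation> element in place of the hand-written <rules> block.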