Problem
I'm using Saxon-EE 11 and my platform's language is en-US.
I'm attempting to implement custom sorting behavior for an <xsl:sort> instruction by specifying a UCA collation. Ignoring the XML document details and just getting to the core, string-by-string comparison question, I want these strings:
ABSENTEES
ABSENTEE VOTING
MINNEAPOLIS TEACHERS RETIREMENT FUND ASSOCIATION (MTRFA)
MINNEAPOLIS-SAINT PAUL INTERNATIONAL AIRPORT
MINNEAPOLIS/SAINT PAUL HOUSING FINANCE BOARD
MINNEAPOLIS
MINNEAPOLIS PORT AUTHORITY
to be sorted into this order:
ABSENTEE VOTING
ABSENTEES
MINNEAPOLIS
MINNEAPOLIS PORT AUTHORITY
MINNEAPOLIS/SAINT PAUL HOUSING FINANCE BOARD
MINNEAPOLIS-SAINT PAUL INTERNATIONAL AIRPORT
MINNEAPOLIS TEACHERS RETIREMENT FUND ASSOCIATION (MTRFA)
Attempting to render the rules into English:
- A string that shares a common prefix with another string, but diverges at a space, should sort before that other string (ABSENTEE VOTING before ABSENTEES).
- Hyphens and slashes should be considered the same as spaces.
What I've tried
The UCA collation http://www.w3.org/2013/collation/UCA?alternate=shifted handles the MINNEAPOLIS* strings correctly, but it will put ABSENTEES before ABSENTEE VOTING.
The bare UCA collation http://www.w3.org/2013/collation/UCA handles ABSENTEES and ABSENTEE VOTING correctly, but will place the MINNEAPOLIS/SAINT PAUL and MINNEAPOLIS-SAINT PAUL strings after anything with MINNEAPOLIS and a space character.
I've attempted a few other combinations of parameters, though none of them has produced anything closer to what I'm looking for. I'm close to giving up and implementing either a custom pre-processing before applying the collation or else dropping into a Java implementation.
If what I'm looking for is truly not achievable with UCA collations, that's good to know.
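For reference, a minimal sketch of what the Java fallback could look like, assuming plain String.compareTo ordering is acceptable in place of UCA and that all input is upper-case ASCII like the sample strings. The class and method names (WordByWordSort, normalize, sort) are hypothetical, and nothing here uses a Saxon API:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

// Hypothetical helper; none of these names come from Saxon or the stylesheet.
final class WordByWordSort {

    // Treat hyphens and slashes as spaces, then collapse runs of whitespace.
    static String normalize(String s) {
        return s.replaceAll("[-/]", " ").replaceAll("\\s+", " ").trim();
    }

    // Compare on the normalized key using String.compareTo (not UCA).
    // A space (U+0020) sorts below every letter, so "ABSENTEE VOTING"
    // precedes "ABSENTEES". Equal keys fall back to the original strings
    // so the result is deterministic.
    static final Comparator<String> WORD_BY_WORD = (a, b) -> {
        int byKey = normalize(a).compareTo(normalize(b));
        return byKey != 0 ? byKey : a.compareTo(b);
    };

    // Return a sorted copy, leaving the input list untouched.
    static List<String> sort(List<String> input) {
        List<String> copy = new ArrayList<>(input);
        copy.sort(WORD_BY_WORD);
        return copy;
    }

    public static void main(String[] args) {
        List<String> input = Arrays.asList(
                "ABSENTEES",
                "ABSENTEE VOTING",
                "MINNEAPOLIS TEACHERS RETIREMENT FUND ASSOCIATION (MTRFA)",
                "MINNEAPOLIS-SAINT PAUL INTERNATIONAL AIRPORT",
                "MINNEAPOLIS/SAINT PAUL HOUSING FINANCE BOARD",
                "MINNEAPOLIS",
                "MINNEAPOLIS PORT AUTHORITY");
        // Print one string per line in sorted order.
        sort(input).forEach(System.out::println);
    }
}
```

For the seven sample strings this produces the desired order listed above. The same normalization could presumably be applied inside the stylesheet as a sort key, for example translate(., '-/', '  ') combined with a codepoint collation, but that gives up the language-aware behaviour that UCA provides.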
Using an input of:
XML
and the following stylesheet:
XSLT 2.0
I get:
Result