Problem
I'm using Saxon-EE 11 and my platform's language is en-us.
I'm attempting to implement custom sorting behavior for an <xsl:sort> instruction by specifying a UCA collation. Ignoring the XML document details and getting to the core string-by-string comparison question, I want these strings:
ABSENTEES
ABSENTEE VOTING
MINNEAPOLIS TEACHERS RETIREMENT FUND ASSOCIATION (MTRFA)
MINNEAPOLIS-SAINT PAUL INTERNATIONAL AIRPORT
MINNEAPOLIS/SAINT PAUL HOUSING FINANCE BOARD
MINNEAPOLIS
MINNEAPOLIS PORT AUTHORITY
to be sorted into this order:
ABSENTEE VOTING
ABSENTEES
MINNEAPOLIS
MINNEAPOLIS PORT AUTHORITY
MINNEAPOLIS/SAINT PAUL HOUSING FINANCE BOARD
MINNEAPOLIS-SAINT PAUL INTERNATIONAL AIRPORT
MINNEAPOLIS TEACHERS RETIREMENT FUND ASSOCIATION (MTRFA)
Attempting to render the rules into English:
- A string that shares a common prefix with another string but diverges at a space should sort before that other string (ABSENTEE VOTING before ABSENTEES).
- Hyphens and slashes should be treated the same as spaces.
What I've tried
The UCA collation http://www.w3.org/2013/collation/UCA?alternate=shifted handles the MINNEAPOLIS* strings correctly, but it puts ABSENTEES before ABSENTEE VOTING.
The bare UCA collation http://www.w3.org/2013/collation/UCA handles ABSENTEES and ABSENTEE VOTING correctly, but it places the MINNEAPOLIS/SAINT PAUL and MINNEAPOLIS-SAINT PAUL strings after anything with MINNEAPOLIS and a space character.
I've attempted a few other combinations of parameters, though none of them has produced anything closer to what I'm looking for. I'm close to giving up and implementing either a custom pre-processing before applying the collation or else dropping into a Java implementation.
If what I'm looking for is truly not achievable with UCA collations, that's good to know.
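To make the target order concrete, here is a minimal sketch of what the Java fallback might look like. It is not a UCA collation: it assumes a plain code-unit comparison is acceptable for these all-uppercase ASCII terms, and it replaces hyphens and slashes with spaces before comparing. The class name IndexTermOrder and its members are hypothetical, and registering the comparator with Saxon as a collation is a separate step not shown here.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Hypothetical helper for illustration; not part of Saxon or any library.
class IndexTermOrder {

    // Rule 2: hyphens and slashes count as spaces.
    static String normalize(String s) {
        return s.replace('-', ' ').replace('/', ' ');
    }

    // Rule 1 falls out of String's natural ordering: space (U+0020) is
    // lower than any ASCII letter, and a prefix sorts before its extensions.
    // Assumption: code-unit order is good enough (no locale-aware rules).
    static final Comparator<String> BY_RULES =
            Comparator.comparing(IndexTermOrder::normalize);

    // Returns a sorted copy; terms that differ only in '-', '/' or ' '
    // compare equal and keep their input order (List.sort is stable).
    static List<String> sorted(List<String> terms) {
        List<String> copy = new ArrayList<>(terms);
        copy.sort(BY_RULES);
        return copy;
    }

    public static void main(String[] args) {
        List<String> input = List.of(
                "ABSENTEES",
                "ABSENTEE VOTING",
                "MINNEAPOLIS TEACHERS RETIREMENT FUND ASSOCIATION (MTRFA)",
                "MINNEAPOLIS-SAINT PAUL INTERNATIONAL AIRPORT",
                "MINNEAPOLIS/SAINT PAUL HOUSING FINANCE BOARD",
                "MINNEAPOLIS",
                "MINNEAPOLIS PORT AUTHORITY");
        sorted(input).forEach(System.out::println);
    }
}
```

Run against the seven strings above, this sketch prints them in the desired order listed earlier. The same normalization could presumably be applied as a sort-key pre-processing step instead, leaving the comparison itself to the bare UCA collation.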
Using an input of:
XML
and the following stylesheet:
XSLT 2.0
I get:
Result