I have a rather large XSLT template that contains bilingual text (national characters in UTF-8). I am looking for a function that will recode all character data (CDATA) inside it to use XML numeric character references (&#...;), allowing me to store the XSLT in plain US-ASCII encoding.
Here is a basic example:
<?xml version="1.0" encoding="UTF-8"?>
<test>Soirée</test>
where é is encoded as the bytes C3 A9. The desired output would be:
<?xml version="1.0" encoding="US-ASCII"?>
<test>Soir&#233;e</test>
where &#233; is the numeric character reference for the code point U+00E9 (LATIN SMALL LETTER E WITH ACUTE). Simply changing the encoding declaration on the first example results in an error, because the UTF-8 bytes C3 A9 are not valid US-ASCII.
Is there a simple way to do this or do I have to resort to a macro?