In Java I need to convert an XML Document to a string, with non-printable characters in data as hex

I have a method that takes a Document and produces an XML String. It works fine, except that spaces, tabs, and similar characters are written as-is in the node values. I need them written as hex character references instead.

Here's the method I have:

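import java.io.StringWriter;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;

public static String docToXML( Document doc )
{
    try
    {
        StringWriter sw = new StringWriter();
        // Identity transformer: copies the DOM to the output unchanged
        TransformerFactory tf = TransformerFactory.newInstance();
        Transformer transformer = tf.newTransformer();
        transformer.setOutputProperty(OutputKeys.METHOD, "xml");
        transformer.setOutputProperty(OutputKeys.INDENT, "yes");
        transformer.setOutputProperty(OutputKeys.ENCODING, "UTF-8");

        transformer.transform(new DOMSource(doc), new StreamResult(sw));
        return sw.toString();
    }
    catch (Exception ex)
    {
        throw new RuntimeException("Error converting to String", ex);
    }
}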

Even if the value is entered into the document as a hex character reference, it comes out as a literal space or tab when the document is converted to a String.
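A small check of why this happens, as far as I can tell: the parser resolves character references while building the DOM, so the Document only ever holds the literal character. The class and method names here (CharRefDemo, rootText) are made up for illustration.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

final class CharRefDemo {

    /** Parses the XML (hypothetical helper) and returns the root element's text content. */
    static String rootText(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            return doc.getDocumentElement().getTextContent();
        } catch (Exception ex) {
            throw new RuntimeException("Error parsing XML", ex);
        }
    }

    public static void main(String[] args) {
        String literal = rootText("<MyField> </MyField>");
        String hexRef = rootText("<MyField>&#x20;</MyField>");
        // Both inputs produce the same one-character text node,
        // so the serializer cannot know which form was used originally.
        System.out.println(literal.equals(hexRef));
    }
}
```

So the original spelling is already gone before the Transformer sees the document; the hex form has to be produced at serialization time.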

Does anyone know how to make this happen? I'm assuming it's an Output Property, but I haven't found one.

EDIT:

The current behavior is something like this (for a space):

<MyField> </MyField>

The desired behavior is:

<MyField>&#x20;</MyField>

There is 1 answer

Michael Kay

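With XSLT 2.0 you can use character maps to achieve this. The map has to be named and referenced from xsl:output via use-character-maps (the name "hex-chars" below is only an example). Note that the JDK's built-in TransformerFactory implements XSLT 1.0 only, so this needs an XSLT 2.0 processor such as Saxon.

<xsl:output method="xml" use-character-maps="hex-chars"/>

<xsl:character-map name="hex-chars">
  <xsl:output-character character=" " string="&amp;#x20;"/>
  <xsl:output-character character="&#9;" string="&amp;#x09;"/>
  ...
</xsl:character-map>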
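If adding an XSLT 2.0 processor is not an option, a different technique is to skip the Transformer and walk the DOM by hand, writing the character references yourself. This is not a character map, just a minimal stdlib-only sketch under some assumptions: the class name HexWhitespaceSerializer and its methods are hypothetical, "non-printable" is taken to mean code points up to 0x20 plus 0x7F, attribute values get the same treatment as text, and comments, processing instructions, the XML declaration, and indentation are left out.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NamedNodeMap;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

final class HexWhitespaceSerializer {

    /** Serializes a node, writing whitespace and control characters as hex references. */
    static String serialize(Node node) {
        StringBuilder out = new StringBuilder();
        write(node, out);
        return out.toString();
    }

    private static void write(Node node, StringBuilder out) {
        switch (node.getNodeType()) {
            case Node.DOCUMENT_NODE: {
                writeChildren(node, out);
                break;
            }
            case Node.ELEMENT_NODE: {
                out.append('<').append(node.getNodeName());
                NamedNodeMap attrs = node.getAttributes();
                for (int i = 0; i < attrs.getLength(); i++) {
                    Node attr = attrs.item(i);
                    out.append(' ').append(attr.getNodeName()).append("=\"")
                       .append(escape(attr.getNodeValue(), true)).append('"');
                }
                out.append('>');
                writeChildren(node, out);
                out.append("</").append(node.getNodeName()).append('>');
                break;
            }
            case Node.TEXT_NODE:
            case Node.CDATA_SECTION_NODE: {
                // CDATA content is written as ordinary escaped text
                out.append(escape(node.getNodeValue(), false));
                break;
            }
            default:
                // Comments, processing instructions, doctype: skipped in this sketch
                break;
        }
    }

    private static void writeChildren(Node parent, StringBuilder out) {
        for (Node c = parent.getFirstChild(); c != null; c = c.getNextSibling()) {
            write(c, out);
        }
    }

    /** Escapes markup characters; code points <= 0x20 and 0x7F become &#xNN; */
    private static String escape(String value, boolean inAttribute) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < value.length(); i++) {
            char c = value.charAt(i);
            if (c == '&') sb.append("&amp;");
            else if (c == '<') sb.append("&lt;");
            else if (c == '>') sb.append("&gt;");
            else if (c == '"' && inAttribute) sb.append("&quot;");
            else if (c <= 0x20 || c == 0x7F) sb.append(String.format("&#x%02X;", (int) c));
            else sb.append(c);
        }
        return sb.toString();
    }

    /** Convenience parser for the demo (assumption: input is a well-formed XML string). */
    static Document parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
        } catch (Exception ex) {
            throw new RuntimeException("Error parsing XML", ex);
        }
    }

    public static void main(String[] args) {
        // prints <MyField>&#x20;</MyField>
        System.out.println(serialize(parse("<MyField> </MyField>")));
    }
}
```

Because no indentation is emitted, every whitespace character in the output comes from the data. The trade-off is that you lose the Transformer's output properties and have to handle any node types you care about yourself.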