XSL transform an xml with character entities in element names

My xml looks like:

<record>
    <name>ABC</name>
    <address>
        &lt;street&gt;sss&lt;/street&gt;
        &lt;city&gt;ccc&lt;/city&gt;
        &lt;state&gt;ttt&lt;/state&gt;
    </address>
</record>

I am trying to read the element 'street' using the xsl:

<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
    <xsl:output omit-xml-declaration="yes" indent="yes" />
    <xsl:template match="/">
        <xsl:value-of select="record/address/street" />
    </xsl:template>
</xsl:stylesheet>

but it doesn't give any output.

Why does this happen even though the input is valid XML? How can I transform XML files in which the element tags are written with character entities?

There are 3 answers

0
michael.hor257k (best answer)

To add to Michael Kay's answer:

If you start by processing your XML using:

XSLT 1.0

<xsl:stylesheet version="1.0" 
xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="xml" version="1.0" encoding="UTF-8" indent="yes"/>
<xsl:strip-space elements="*"/>

<!-- identity transform -->
<xsl:template match="@*|node()">
    <xsl:copy>
        <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
</xsl:template>

<xsl:template match="address">
    <xsl:copy>
        <xsl:value-of select="." disable-output-escaping="yes"/>
    </xsl:copy>
</xsl:template>

</xsl:stylesheet>

and save the result to file, you will then be able to use your stylesheet to process the resulting file and get the expected result.
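For the input in the question, the saved intermediate file should look roughly like the sketch below. Exact indentation and the XML declaration depend on the processor, and this assumes the processor honors disable-output-escaping, which is an optional feature in XSLT 1.0 and only takes effect when the processor itself serializes the result.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<record>
    <name>ABC</name>
    <address>
        <street>sss</street>
        <city>ccc</city>
        <state>ttt</state>
    </address>
</record>
```

In this file street, city and state are real child elements of address, so the path record/address/street from the question now selects a node.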

0
Sam
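<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
    <!-- suppress the name element -->
    <xsl:template match="name"/>
    <xsl:template match="record/address">
        <!-- output the address text that precedes the escaped city element, without re-escaping it -->
        <xsl:value-of select="substring-before(., '&lt;city&gt;ccc&lt;/city&gt;')" disable-output-escaping="yes"/>
    </xsl:template>
</xsl:stylesheet>

Check this code.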

1
Michael Kay

There is no street element. If it were written <street>...</street> then it would be an element, but the angle brackets have been carefully escaped to indicate that it should be treated as plain text.
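One way to confirm this is to ask the processor what it actually sees. The stylesheet below is a minimal sketch for the input in the question; it uses only standard XSLT 1.0 and prints the number of child elements of address, followed by its string value.

```xml
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
    <xsl:output method="text"/>
    <xsl:template match="/">
        <!-- number of child elements of address: 0 for the input in the question -->
        <xsl:value-of select="count(record/address/*)"/>
        <xsl:text>&#10;</xsl:text>
        <!-- string value of address: plain text that contains literal angle brackets -->
        <xsl:value-of select="record/address"/>
    </xsl:template>
</xsl:stylesheet>
```

The count is 0 because address has a single text node as its only child, which is why record/address/street selects nothing.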

Converting plain text containing angle brackets into an XML node structure involves parsing; that is, you need to execute a second parse on the text content of the address element. This is complicated by the fact that what you have here is an XML fragment and not a complete XML document.

In XSLT 3.0 you can achieve this using the parse-xml-fragment() function. In earlier releases you may be able to achieve it by calling out to custom extension functions, or (as @sandeepkamboj suggests) by writing a simple XML parser in XSLT (to do that you will need to be confident that you know what subset of XML constructs you need to handle).
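As a minimal sketch of the XSLT 3.0 route, assuming a processor that supports parse-xml-fragment() (recent Saxon releases, for example) and the input from the question, the stylesheet below parses the text of address a second time and then selects street from the result.

```xml
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="3.0">
    <xsl:output method="text"/>
    <xsl:template match="/">
        <!-- parse the escaped text of address as an XML fragment; the result is a
             document node whose children include street, city and state -->
        <xsl:value-of select="parse-xml-fragment(record/address)/street"/>
    </xsl:template>
</xsl:stylesheet>
```

parse-xml-fragment() accepts content with several top-level elements, which is what address holds here; parse-xml() would reject it because it expects a complete document with a single root element.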

Perhaps the best approach is to find out why someone has generated this ridiculous document, and get them to mend their ways.