XML fix namespace declaration


I am trying to detect and work around this bug in RSS elements. That means I have to find an incorrect namespace declaration and change its value to the correct namespace. E.g.:

xmlns:media="http://search.yahoo.com/mrss" 

must be:

xmlns:media="http://search.yahoo.com/mrss/" 

How can I achieve that given an org.w3c.dom.Document?

Meanwhile, I found out how to get all elements in a certain namespace:

        XPathFactory xpf = XPathFactory.newInstance();
        XPath xpath = xpf.newXPath();
        XPathExpression expr = xpath.compile("//*[namespace-uri()='http://search.yahoo.com/mrss']");

        Object result = expr.evaluate(d, XPathConstants.NODESET);
        if (result != null) {
            NodeList nodes = (NodeList) result;
            for (int i = 0; i < nodes.getLength(); i++) {
                Node n = nodes.item(i);
                this.log.warn("Found old mediaRSS namespace declaration: " + n.getTextContent());
            }
        }

So now I have to figure out how to change the namespace of a Node via JAXP.
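
For the DOM-only route, one option is DOM Level 3's `Document.renameNode(node, namespaceURI, qualifiedName)`, which can move an element into another namespace while keeping its prefix. The following is a minimal sketch, not a tested solution for every feed. It assumes the document was parsed namespace-aware, it only handles elements (not namespaced attributes), and the names `RenameMediaNamespace`, `fixNamespace` and `parse` are made up for illustration.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.XMLConstants;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NamedNodeMap;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper class, not part of the original question.
class RenameMediaNamespace {
    static final String OLD_NS = "http://search.yahoo.com/mrss";
    static final String NEW_NS = "http://search.yahoo.com/mrss/";

    /** Moves every element in OLD_NS to NEW_NS in place and returns how many were renamed. */
    static int fixNamespace(Document doc) {
        // getElementsByTagNameNS returns a live list, so take a snapshot before renaming.
        NodeList live = doc.getElementsByTagNameNS(OLD_NS, "*");
        List<Node> snapshot = new ArrayList<>();
        for (int i = 0; i < live.getLength(); i++) {
            snapshot.add(live.item(i));
        }
        for (Node n : snapshot) {
            // Keep the qualified name (e.g. "media:title"), change only the namespace URI.
            doc.renameNode(n, NEW_NS, n.getNodeName());
        }

        // Also correct the xmlns:* declaration attributes so serialization stays consistent.
        NodeList all = doc.getElementsByTagName("*");
        for (int i = 0; i < all.getLength(); i++) {
            NamedNodeMap attrs = all.item(i).getAttributes();
            for (int j = 0; j < attrs.getLength(); j++) {
                Node attr = attrs.item(j);
                if (XMLConstants.XMLNS_ATTRIBUTE_NS_URI.equals(attr.getNamespaceURI())
                        && OLD_NS.equals(attr.getNodeValue())) {
                    attr.setNodeValue(NEW_NS);
                }
            }
        }
        return snapshot.size();
    }

    /** Parses an XML string with namespace awareness switched on. */
    static Document parse(String xml) throws Exception {
        DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
        dbf.setNamespaceAware(true);
        return dbf.newDocumentBuilder().parse(new InputSource(new StringReader(xml)));
    }
}
```

Calling `RenameMediaNamespace.fixNamespace(d)` on the document from the question would modify it in place, so no second document is needed.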


There are 2 answers

er4z0r On BEST ANSWER

Just for the sake of completeness:

Java Code:

Document d = out.outputW3CDom(converted);
DOMSource oldDocument = new DOMSource(d);
DOMResult newDocument = new DOMResult();
TransformerFactory tf = TransformerFactory.newInstance();
StreamSource xsltsource = new StreamSource(getStream(MEDIA_RSS_TRANSFORM_XSL));
Transformer transformer = tf.newTransformer(xsltsource);
// After this call the transformed tree is available via newDocument.getNode().
transformer.transform(oldDocument, newDocument);

private InputStream getStream(String fileName) {
    ClassLoader loader = Thread.currentThread().getContextClassLoader();
    // Try the name with a leading slash first, then fall back to the plain name.
    InputStream xslStream = loader.getResourceAsStream("/" + fileName);
    if (xslStream == null) {
        xslStream = loader.getResourceAsStream(fileName);
    }
    return xslStream;
}

Stylesheet:

<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
    <!-- Identity transform: copies the matched node or attribute to the output and applies templates to its children and attributes -->
    <xsl:template match="node()|@*">
        <xsl:copy>
            <xsl:apply-templates select="@*|*|text()" />
        </xsl:copy>
    </xsl:template>

    <!--Specialized template to match on elements with the incorrect namespace and generate a new element-->
    <xsl:template match="//*[namespace-uri()='http://search.yahoo.com/mrss']">
        <xsl:element name="{local-name()}" namespace="http://search.yahoo.com/mrss/" >
            <xsl:apply-templates select="@*|*|text()" />
        </xsl:element>
    </xsl:template>
</xsl:stylesheet>

Special thanks to Mads Hansen for his help with the XSLT.
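
To try the approach without a resource file, the following is a self-contained sketch that inlines an equivalent stylesheet as a string and reads the result back from the `DOMResult`. The class name `MediaRssXsltFix` and its `fix` and `parse` methods are hypothetical, and the match pattern is written as `*[...]` because the leading `//` is redundant in a match pattern. Note that the generated elements are created without the `media` prefix, and the old declaration may still be copied onto ancestor elements.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMResult;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamSource;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

// Hypothetical wrapper class for illustration only.
class MediaRssXsltFix {
    static final String OLD_NS = "http://search.yahoo.com/mrss";
    static final String NEW_NS = "http://search.yahoo.com/mrss/";

    // Same two templates as the stylesheet above, inlined as a string.
    static final String XSL =
          "<xsl:stylesheet version='1.0' xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
        + "<xsl:template match='node()|@*'>"
        + "<xsl:copy><xsl:apply-templates select='@*|*|text()'/></xsl:copy>"
        + "</xsl:template>"
        + "<xsl:template match=\"*[namespace-uri()='" + OLD_NS + "']\">"
        + "<xsl:element name='{local-name()}' namespace='" + NEW_NS + "'>"
        + "<xsl:apply-templates select='@*|*|text()'/>"
        + "</xsl:element>"
        + "</xsl:template>"
        + "</xsl:stylesheet>";

    /** Returns a new document in which elements from OLD_NS are recreated in NEW_NS. */
    static Document fix(Document input) throws Exception {
        Transformer transformer = TransformerFactory.newInstance()
                .newTransformer(new StreamSource(new StringReader(XSL)));
        DOMResult result = new DOMResult();
        transformer.transform(new DOMSource(input), result);
        // With no node set on the DOMResult, the transformer builds a new Document.
        return (Document) result.getNode();
    }

    /** Parses an XML string with namespace awareness switched on. */
    static Document parse(String xml) throws Exception {
        DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
        dbf.setNamespaceAware(true);
        return dbf.newDocumentBuilder().parse(new InputSource(new StringReader(xml)));
    }
}
```

Unlike an in-place DOM edit, this returns a new `Document`, so the caller has to use the returned tree instead of the original one.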

Chris Lercher On

You could probably do it with XSLT, with a rule like this:

<xsl:template match="media:*">
   <xsl:element name="{local-name()}" namespace="http://search.yahoo.com/mrss/">
      <xsl:apply-templates select="node()|@*"/>
   </xsl:element>
</xsl:template>

where media is bound to "http://search.yahoo.com/mrss".

You may have to tweak the syntax a little, as I'm writing this without the help of a compiler. Also, the output will probably not be very nicely formatted (namespace declarations repeated on many elements), but it should be logically correct.