How to parse an XMLDocument in namespace neutral way using JDOM

Question

Asked by feroze on 04 December 2013 at 00:24 · 207 views

I am trying to parse a document using Dom4J. This document comes from various providers, and sometimes comes with namespaces and sometimes without.

For example:

<book>
   <author>john</author>
   <publisher>
     <name>John Q</name>
   </publisher>
</book>

or

<book xmlns="http://schemas.xml.com/XMLSchemaInstance">
   <author>john</author>
   <publisher>
     <name>John Q</name>
   </publisher>
</book>

or

<book xmlns:i="http://schemas.xml.com/XMLSchemaInstance">
   <i:author>john</i:author>
   <i:publisher>
     <i:name>John Q</i:name>
   </i:publisher>
</book>

I have a list of XPaths. I parse the document into a Document object and then search it using those XPaths.

        Document doc = parseDocument(documentFile);
        List<String> XmlPaths = new ArrayList<String>(); // List is an interface, so instantiate java.util.ArrayList
        XmlPaths.add("book/author");
        XmlPaths.add("book/publisher/name");

        for (int i = 0; i < XmlPaths.size(); i++)
        {
            String searchPath = XmlPaths.get(i);

            Node currentNode = doc.selectSingleNode(searchPath);
            assert(currentNode != null);
        }

This code does not work on the last document, the one that is using namespace prefixes.

I tried these techniques, but none of them seem to work.

1) changing the last element in the xpath to be namespace neutral:

/book/:author
/book/[local-name()='author']
/[local-name()='book']/[local-name()='author']

All of these throw an exception saying that the XPath format is not correct.

2) Adding namespace URIs to the XPath, after creating it using DocumentHelper.createXPath();

Any idea what I am doing wrong?

FYI I am using dom4j version 1.5

There is 1 answer

**Marcus Rickert** · Accepted Answer · 2013-12-04T00:57:01+00:00

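Your XPath steps do not contain a tag name. A predicate in square brackets has to follow a node test, so a step such as /[local-name()='book'] is not valid syntax. The general syntax in your case would be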

/TAGNAMEPARENT[CONDITION_PARENT]/TAGNAMECHILD[CONDITION_CHILD]

The important aspect is that the tag names are mandatory while the conditions are optional. If you do not want to specify a tag name, you have to use * for "any tag". There may be performance implications for large XML files, since you will always have to iterate over a node set instead of using an index lookup. Maybe @MichaelKay can comment on this.

Try this instead:

/*[local-name()='book']/*[local-name()='author']
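
As a rough sketch, the same rewrite can be applied to every entry in the path list so that "book/author" and "book/publisher/name" work on all three sample documents. The example below uses the JDK's built-in javax.xml.xpath API rather than dom4j, purely so that it runs without extra dependencies. The class name and the helpers toNamespaceNeutral and selectText are hypothetical, not part of any library. The generated expression is ordinary XPath 1.0, so it should also be accepted by dom4j's selectSingleNode, though that has not been checked against version 1.5.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

class NamespaceNeutralXPath {

    // Hypothetical helper: turns "book/author" into
    // "/*[local-name()='book']/*[local-name()='author']".
    // Assumes the input only contains plain element names separated by '/'.
    static String toNamespaceNeutral(String simplePath) {
        StringBuilder sb = new StringBuilder();
        for (String step : simplePath.split("/")) {
            if (step.isEmpty()) {
                continue; // ignore a leading '/'
            }
            sb.append("/*[local-name()='").append(step).append("']");
        }
        return sb.toString();
    }

    // Parses the XML string and returns the text of the first matching node.
    static String selectText(String xml, String simplePath) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setNamespaceAware(true); // needed so local-name() strips prefixes
        Document doc = factory.newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
        return XPathFactory.newInstance().newXPath()
                .evaluate(toNamespaceNeutral(simplePath), doc);
    }

    public static void main(String[] args) throws Exception {
        String ns = "http://schemas.xml.com/XMLSchemaInstance";
        String[] docs = {
            // no namespace
            "<book><author>john</author>"
                + "<publisher><name>John Q</name></publisher></book>",
            // default namespace
            "<book xmlns='" + ns + "'><author>john</author>"
                + "<publisher><name>John Q</name></publisher></book>",
            // prefixed namespace on the children only
            "<book xmlns:i='" + ns + "'><i:author>john</i:author>"
                + "<i:publisher><i:name>John Q</i:name></i:publisher></book>"
        };
        for (String xml : docs) {
            System.out.println(selectText(xml, "book/author")
                    + " | " + selectText(xml, "book/publisher/name"));
        }
    }
}
```

Note that this matches on local names only, so elements with the same name in unrelated namespaces are treated as equal. If that matters, binding the namespace URI explicitly is the stricter alternative.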
