I have a problem with JDOM2 XPath.
test.xhtml code:
<?xml version="1.0" encoding="utf-8"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="cs" lang="cs">
<head>
<title>mypage</title>
</head>
<body>
<div class="in">
<a class="nextpage" href="url.html">
<img src="img/url.gif" alt="to url.html" />
</a>
</div>
</body>
</html>
Java code:
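import org.jdom2.Document;
import org.jdom2.Element;
import org.jdom2.filter.Filters;
import org.jdom2.input.SAXBuilder;
import org.jdom2.xpath.XPathExpression;
import org.jdom2.xpath.XPathFactory;

SAXBuilder saxBuilder = new SAXBuilder();
Document document = saxBuilder.build("test2.html");
XPathFactory xpfac = XPathFactory.instance();
// Select every <a> element whose class attribute is "nextpage"
XPathExpression<Element> xp = xpfac.compile("//a[@class = 'nextpage']", Filters.element());
for (Element att : xp.evaluate(document)) {
    System.out.println("We have target " + att.getAttributeValue("href"));
}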
With this expression, however, I don't get any elements. When I change the query to //*[@class = 'nextpage'], it finds the element:
We have target url.html
It must have something to do with the namespace, or with something else in the header, because without the header the original query produces output. I don't know what I'm doing wrong.
Note: Although this is the same issue as described in the suggested duplicate, that other question relates to JDOM 1.x. JDOM 2.x differs from it in a number of significant ways, and this answer relates to the JDOM 2.x XPath implementation.
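The XPath specification is very clear about how namespaces are treated in XPath expressions. Unfortunately, the way XPath handles namespaces is slightly different from what people familiar with XML expect. According to the specification, an unprefixed name in a node test is never resolved against the default namespace declared with xmlns; it always refers to a name in no namespace.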
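The rule is not specific to JDOM. As a rough illustration, the sketch below uses the JDK's built-in javax.xml.xpath API rather than JDOM, on a cut-down copy of the page from the question (no DOCTYPE, so nothing is fetched from the network). The class name DefaultNamespaceDemo, the helper count, and the prefix xhtml are made up for this example.

```java
import java.io.StringReader;
import java.util.Collections;
import java.util.Iterator;
import javax.xml.XMLConstants;
import javax.xml.namespace.NamespaceContext;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class DefaultNamespaceDemo {
    static final String XHTML_URI = "http://www.w3.org/1999/xhtml";

    // Cut-down version of the page from the question, without the DOCTYPE.
    static final String XML =
        "<html xmlns=\"" + XHTML_URI + "\"><body>"
        + "<a class=\"nextpage\" href=\"url.html\"/>"
        + "</body></html>";

    /** Returns how many nodes the expression selects in the sample document. */
    static int count(String expression) {
        try {
            DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
            dbf.setNamespaceAware(true); // DOM parsing ignores namespaces by default
            Document doc = dbf.newDocumentBuilder()
                .parse(new InputSource(new StringReader(XML)));

            XPath xpath = XPathFactory.newInstance().newXPath();
            // Bind the (arbitrary) prefix "xhtml" to the XHTML namespace URI.
            xpath.setNamespaceContext(new NamespaceContext() {
                @Override public String getNamespaceURI(String prefix) {
                    return "xhtml".equals(prefix) ? XHTML_URI : XMLConstants.NULL_NS_URI;
                }
                @Override public String getPrefix(String uri) {
                    return XHTML_URI.equals(uri) ? "xhtml" : null;
                }
                @Override public Iterator<String> getPrefixes(String uri) {
                    return XHTML_URI.equals(uri)
                        ? Collections.singleton("xhtml").iterator()
                        : Collections.<String>emptyIterator();
                }
            });
            NodeList nodes =
                (NodeList) xpath.evaluate(expression, doc, XPathConstants.NODESET);
            return nodes.getLength();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(count("//a[@class = 'nextpage']"));       // 0: "a" in no namespace
        System.out.println(count("//*[@class = 'nextpage']"));       // 1: wildcard ignores the name
        System.out.println(count("//xhtml:a[@class = 'nextpage']")); // 1: prefixed name matches
    }
}
```

The unprefixed //a finds nothing because the a element lives in the XHTML namespace, while the wildcard and the prefixed form both find it. The class attribute needs no prefix, since unprefixed attributes are in no namespace.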
In practice, this means that any time you have a 'default' namespace in your XML document, you still need to prefix that namespace when using it in an XPath expression. The XPathFactory.compile(...) method alludes to this requirement in the JavaDoc, but it is not as clear as it should be. The prefix you use is arbitrary, and local to that XPath expression only. In your case, you could choose the prefix xhtml for the URI http://www.w3.org/1999/xhtml, pass that namespace to compile(...), and query for //xhtml:a[@class = 'nextpage'].
I should add this to the FAQ... Thanks.
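Here is a minimal sketch of that fix in JDOM 2.x, assuming JDOM2 and its Jaxen dependency are on the classpath. It uses the compile(...) overload that takes a variables map (null here) followed by the namespaces to bind. The class name NextPageLinks and the helper nextPageHrefs are made up for the example, and the document is parsed from an inline string without the DOCTYPE so that it stays self-contained.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import org.jdom2.Document;
import org.jdom2.Element;
import org.jdom2.Namespace;
import org.jdom2.filter.Filters;
import org.jdom2.input.SAXBuilder;
import org.jdom2.xpath.XPathExpression;
import org.jdom2.xpath.XPathFactory;

class NextPageLinks {
    // Cut-down version of the page from the question, without the DOCTYPE.
    static final String XML =
        "<html xmlns=\"http://www.w3.org/1999/xhtml\"><body><div class=\"in\">"
        + "<a class=\"nextpage\" href=\"url.html\"><img src=\"img/url.gif\" alt=\"to url.html\"/></a>"
        + "</div></body></html>";

    /** Returns the href of every XHTML <a class="nextpage"> element in the document. */
    static List<String> nextPageHrefs(String xml) {
        try {
            Document document = new SAXBuilder().build(new StringReader(xml));

            // The prefix is arbitrary; only the URI has to match the document.
            Namespace xhtml = Namespace.getNamespace("xhtml", "http://www.w3.org/1999/xhtml");

            // No XPath variables (null), one namespace bound for this expression.
            XPathExpression<Element> xp = XPathFactory.instance().compile(
                "//xhtml:a[@class = 'nextpage']", Filters.element(), null, xhtml);

            List<String> hrefs = new ArrayList<>();
            for (Element a : xp.evaluate(document)) {
                hrefs.add(a.getAttributeValue("href"));
            }
            return hrefs;
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        for (String href : nextPageHrefs(XML)) {
            System.out.println("We have target " + href);
        }
    }
}
```

Only the element name takes the prefix. The class and href attributes are unprefixed in the document, so they are in no namespace and are addressed without one.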