I want to get the element path while parsing XML with the Java StAX2 parser. How can I get the path of the current element?
    <root>
      <a><b>x</b></a>
    </root>
In this example the path is /root/a/b.
Method 1: a dedicated stack of element names
    try (InputStream in = new ByteArrayInputStream(xml.getBytes())) {
        final XMLInputFactory2 factory = (XMLInputFactory2) XMLInputFactory.newInstance();
        final XMLStreamReader2 reader = (XMLStreamReader2) factory.createXMLStreamReader(in);
        // Names of the currently open elements, root at the bottom
        Stack<String> pathStack = new Stack<>();
        while (reader.hasNext()) {
            reader.next();
            if (reader.isStartElement()) {
                // Entering an element: extend the path and report it
                pathStack.push(reader.getLocalName());
                processPath('/' + String.join("/", pathStack));
            } else if (reader.isEndElement()) {
                // Leaving an element: drop its name from the path
                pathStack.pop();
            }
        }
    }
Method 2: an adapter over InputElementStack
Implement an adapter that accesses Woodstox's InputElementStack and its protected mCurrElement field, and iterate over the parent elements (this slows the algorithm down).
    // Must live in Woodstox's own package to reach the protected fields
    package com.ctc.wstx.sr;

    import java.util.LinkedList;

    public class StackUglyAdapter {
        public static final String PATH_SEPARATOR = "/";

        private final InputElementStack stack;

        public StackUglyAdapter(InputElementStack stack) {
            this.stack = stack;
        }

        public String getCurrElementLocalName() {
            return this.stack.mCurrElement.mLocalName;
        }

        public String getCurrElementPath() {
            LinkedList<String> list = new LinkedList<String>();
            // Walk from the current element up to the root, prepending each name
            Element el = this.stack.mCurrElement;
            while (el != null) {
                list.addFirst(el.mLocalName);
                el = el.mParent;
            }
            return PATH_SEPARATOR + String.join(PATH_SEPARATOR, list);
        }
    }
Example of use:

    try (final InputStream in = new ByteArrayInputStream(xml.getBytes())) {
        final XMLInputFactory2 factory =
                (XMLInputFactory2) XMLInputFactory.newInstance();
        final XMLStreamReader2 reader =
                (XMLStreamReader2) factory.createXMLStreamReader(in);
        // The cast works only when the reader is a Woodstox implementation
        final StackUglyAdapter stackAdapter =
                new StackUglyAdapter(((StreamReaderImpl) reader).getInputElementStack());
        while (reader.hasNext()) {
            reader.next();
            if (reader.isStartElement()) {
                processPath(stackAdapter.getCurrElementPath());
            }
        }
    }
Method 1, with a dedicated stack, is better because it does not depend on the StAX implementation and is just as fast as Method 2.
Keep a stack. Push the element name on START_ELEMENT and pop it on END_ELEMENT.
Here's a short example. It does nothing other than print the path of the element being processed.
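A minimal sketch of that approach, assuming only the standard javax.xml.stream API, could look like the following. The names PathTracker and collectPaths are illustrative and not part of any library; the sketch gathers the paths into a list so that main can print them.

```java
import java.io.StringReader;
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

// Hypothetical helper class, named here only for illustration.
public class PathTracker {

    // Returns the path of every element in document order.
    static List<String> collectPaths(String xml) throws XMLStreamException {
        XMLStreamReader reader =
                XMLInputFactory.newInstance().createXMLStreamReader(new StringReader(xml));
        // Open element names, root first; addLast/removeLast keep that order
        Deque<String> stack = new ArrayDeque<>();
        List<String> paths = new ArrayList<>();
        try {
            while (reader.hasNext()) {
                switch (reader.next()) {
                    case XMLStreamConstants.START_ELEMENT:
                        // Push the name, then record the path from the root
                        stack.addLast(reader.getLocalName());
                        paths.add("/" + String.join("/", stack));
                        break;
                    case XMLStreamConstants.END_ELEMENT:
                        // Pop the name of the element that just closed
                        stack.removeLast();
                        break;
                    default:
                        break;
                }
            }
        } finally {
            reader.close();
        }
        return paths;
    }

    public static void main(String[] args) throws XMLStreamException {
        String xml = "<root><a><b>x</b></a></root>";
        for (String path : collectPaths(xml)) {
            System.out.println(path);
        }
    }
}
```

For the XML in the question this prints /root, /root/a and /root/a/b, one per line. The deque is filled with addLast rather than push so that iterating it yields the names from the root downwards, which is the order String.join needs.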