I want to unmarshal part of a large XML file. A solution for this already exists, but I want to improve it for my own implementation.
Please have a look at the following code: (source)
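public static void main(String[] args) throws Exception {
    XMLInputFactory xif = XMLInputFactory.newFactory();
    StreamSource xml = new StreamSource("input.xml");
    XMLStreamReader xsr = xif.createXMLStreamReader(xml);
    xsr.nextTag(); // move to the root element, fileVersionListWrapper
    // Advance until the reader sits on <VersionList versionNumber="1.81">.
    // versionNumber is an attribute, so it is read with getAttributeValue, not getElementText.
    while (xsr.hasNext() && !(xsr.isStartElement()
            && xsr.getLocalName().equals("VersionList")
            && "1.81".equals(xsr.getAttributeValue(null, "versionNumber")))) {
        xsr.next(); // next() instead of nextTag(), which throws on non-whitespace text
    }
}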
I want to unmarshal input.xml (given below) only for the node with versionNumber="1.81".
With the current code, the XMLStreamReader first checks the node with versionNumber="1.80", then checks all of its sub nodes, and only then moves on to the node with versionNumber="1.81", where the exit condition of the while loop is satisfied.
Since I only want to check the VersionList nodes, iterating over their sub nodes is unnecessary, and for a large XML file iterating over all sub nodes of version 1.80 will take a long time. I want to check only the top-level VersionList nodes: if the first one (versionNumber="1.80") does not match, the XMLStreamReader should jump directly to the next one (versionNumber="1.81"). This does not seem achievable with xsr.nextTag(). Is there any way to iterate through only the desired nodes?
input.xml:
<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<fileVersionListWrapper FileName="src.h">
    <VersionList versionNumber="1.80">
        <Reviewed>
            <commentId>v1.80(c5)</commentId>
            <author>Robin</author>
            <lines>47</lines>
            <lines>48</lines>
            <lines>49</lines>
        </Reviewed>
        <Reviewed>
            <commentId>v1.80(c6)</commentId>
            <author>Sujan</author>
            <lines>82</lines>
            <lines>83</lines>
            <lines>84</lines>
            <lines>85</lines>
        </Reviewed>
    </VersionList>
    <VersionList versionNumber="1.81">
        <Reviewed>
            <commentId>v1.81(c4)</commentId>
            <author>Robin</author>
            <lines>47</lines>
            <lines>48</lines>
            <lines>49</lines>
        </Reviewed>
        <Reviewed>
            <commentId>v1.81(c5)</commentId>
            <author>Sujan</author>
            <lines>82</lines>
            <lines>83</lines>
            <lines>84</lines>
            <lines>85</lines>
        </Reviewed>
    </VersionList>
</fileVersionListWrapper>
You can get the node from the XML using XPath.
XPath, the XML Path Language, is a query language for selecting nodes from an XML document. In addition, XPath may be used to compute values (e.g., strings, numbers, or Boolean values) from the content of an XML document. See: What is XPath.
Your XPath expression should select only the VersionList element whose versionNumber attribute is 1.81.
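The expression itself is not shown here, so the following is only a sketch of what such an expression could look like, assuming the element and attribute names from input.xml in the question. The class name VersionXPath and the helper countMatches are hypothetical; the helper just counts how many elements the expression selects.

```java
import java.io.StringReader;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.xml.sax.InputSource;

public class VersionXPath {
    // Assumed expression: every VersionList child of the root element
    // whose versionNumber attribute equals 1.81.
    static final String EXPRESSION =
            "/fileVersionListWrapper/VersionList[@versionNumber='1.81']";

    // Hypothetical helper: counts the elements selected by EXPRESSION.
    static int countMatches(String xml) {
        try {
            XPath xpath = XPathFactory.newInstance().newXPath();
            // count(...) yields an XPath number, which Java returns as a Double.
            Double count = (Double) xpath.evaluate(
                    "count(" + EXPRESSION + ")",
                    new InputSource(new StringReader(xml)),
                    XPathConstants.NUMBER);
            return count.intValue();
        } catch (Exception e) {
            throw new IllegalStateException("Could not evaluate XPath", e);
        }
    }

    public static void main(String[] args) {
        String xml = "<fileVersionListWrapper FileName='src.h'>"
                + "<VersionList versionNumber='1.80'/>"
                + "<VersionList versionNumber='1.81'/>"
                + "</fileVersionListWrapper>";
        System.out.println(countMatches(xml));
    }
}
```

The predicate in square brackets is what restricts the result to version 1.81; writing //VersionList[...] instead would match VersionList elements at any depth.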
Java code
I have assumed that you have the XML as a String, so you first need to parse it into a Document and evaluate the XPath expression against it.
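As a rough sketch of this step, assuming the XML is held in a String and using only the JDK's DOM and XPath APIs, the parsing and selection might look like this. The class name VersionSelector and both method names are hypothetical, and the expression passed in is the assumed one, not necessarily the one originally intended.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

public class VersionSelector {
    // Parses an XML String into a DOM Document (the whole document is held in memory).
    static Document parse(String xml) {
        try {
            DocumentBuilder builder =
                    DocumentBuilderFactory.newInstance().newDocumentBuilder();
            return builder.parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse XML", e);
        }
    }

    // Evaluates the XPath expression and returns all matching nodes.
    static NodeList select(Document doc, String expression) {
        try {
            XPath xpath = XPathFactory.newInstance().newXPath();
            return (NodeList) xpath.evaluate(expression, doc, XPathConstants.NODESET);
        } catch (Exception e) {
            throw new IllegalStateException("Could not evaluate XPath", e);
        }
    }

    public static void main(String[] args) {
        String xml = "<fileVersionListWrapper FileName='src.h'>"
                + "<VersionList versionNumber='1.80'/>"
                + "<VersionList versionNumber='1.81'/>"
                + "</fileVersionListWrapper>";
        NodeList matches = select(parse(xml), "//VersionList[@versionNumber='1.81']");
        System.out.println("Matching nodes: " + matches.getLength());
    }
}
```

Note that this builds the full DOM tree before the XPath is evaluated, so it trades memory for convenience on very large files.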
Now simply loop through each node in the result.
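To turn the nodes back into XML, you have to create a new Document and append the nodes to it. Because the nodes still belong to the original Document, they must be imported into the new one before they can be appended.
Once you have the new Document, run a serializer over it to get the XML.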
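The loop, the new Document, and the serializer can be sketched together as follows. This is a minimal illustration under a few assumptions: the class name VersionExtractor and the method extract are hypothetical, the new root element reuses the name fileVersionListWrapper so that more than one matched node can be appended, and a plain identity Transformer stands in for the serializer.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

public class VersionExtractor {
    // Returns a new XML String containing only the nodes matched by the expression.
    static String extract(String xml, String expression) {
        try {
            DocumentBuilder builder =
                    DocumentBuilderFactory.newInstance().newDocumentBuilder();
            Document source = builder.parse(new InputSource(new StringReader(xml)));
            NodeList matches = (NodeList) XPathFactory.newInstance().newXPath()
                    .evaluate(expression, source, XPathConstants.NODESET);

            // A Document can have only one root, so the matches go under a wrapper.
            Document target = builder.newDocument();
            Element root = target.createElement("fileVersionListWrapper");
            target.appendChild(root);
            for (int i = 0; i < matches.getLength(); i++) {
                // importNode makes a deep copy owned by the target Document.
                root.appendChild(target.importNode(matches.item(i), true));
            }

            // An identity Transformer serializes the DOM tree to text.
            Transformer serializer = TransformerFactory.newInstance().newTransformer();
            serializer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
            StringWriter out = new StringWriter();
            serializer.transform(new DOMSource(target), new StreamResult(out));
            return out.toString();
        } catch (Exception e) {
            throw new IllegalStateException("Could not extract matching nodes", e);
        }
    }

    public static void main(String[] args) {
        String xml = "<fileVersionListWrapper FileName='src.h'>"
                + "<VersionList versionNumber='1.80'><author>Robin</author></VersionList>"
                + "<VersionList versionNumber='1.81'><author>Sujan</author></VersionList>"
                + "</fileVersionListWrapper>";
        System.out.println(extract(xml, "//VersionList[@versionNumber='1.81']"));
    }
}
```

Appending a matched node directly, without importNode, would fail because the node is owned by the source Document.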
Now that you have the XML as a String, I have assumed that you already have a JAXB-annotated class that maps the VersionList element, so you can simply use the JAXB Unmarshaller to create the objects for you.