I want to use the SAX parser for a large XML file. The handler looks like this:
DefaultHandler handler = new DefaultHandler() {
    String temp;
    HashSet<String> xml_Elements = new LinkedHashSet<String>();
    HashMap<String, Boolean> xml_Tags = new LinkedHashMap<String, Boolean>();
    HashMap<String, ArrayList<String>> tags_Value = new LinkedHashMap<String, ArrayList<String>>();

    // ### startElement #######
    public void startElement(String uri, String localName, String qName,
            Attributes attributes) throws SAXException {
        xml_Elements.add(qName);
        for (String tag : xml_Elements) {
            if (qName == tag) {
                xml_Tags.put(qName, true);
            }
        }
    }

    // ########### Characters ###########
    public void characters(char ch[], int start, int length) throws SAXException {
        temp = new String(ch, start, length);
    }

    // ########### endElement ############
    public void endElement(String uri, String localName,
            String qName) throws SAXException {
        if (xml_Tags.get(qName) == true) {
            if (tags_Value.containsKey(qName)) {
                tags_Value.get(qName).add(temp);
                tags_Value.put(qName, tags_Value.get(qName));
            }
            else {
                ArrayList<String> tempList = new ArrayList<String>();
                tempList.add(temp);
                //tags_Value.put(qName, new ArrayList<String>());
                tags_Value.put(qName, tempList);
            }
            //documentWriter.write(qName + ":" + temp + "\t");
            for (String a : tags_Value.keySet()) {
                try {
                    documentWriter.write(tags_Value.get(a) + "\t");
                }
                catch (IOException e) {
                    // TODO Auto-generated catch block
                    e.printStackTrace();
                }
            }
            xml_Tags.put(qName, false);
        }
        tags_Value.clear();
    }
};
My XML is like:
<TermInfo>
    <A>1/f noise</A>
    <B>Random noise</B>
    <C>Accepted</C>
    <D>Flicker noise</D>
    <F>Pink noise</F>
    <I>1-f</I>
    <I>1/f</I>
    <I>1/f noise</I>
    <I>1:f</I>
    <I>flicker noise</I>
    <I>noise</I>
    <I>pink noise</I>
    <ID>1</ID>
</TermInfo>
<TermInfo>
    <A>3D printing</A>
    <B>Materials fabrication</B>
    <C>Accepted</C>
    <D>3d printing</D>
    <F>2</F>
    <I>three dimension*</I>
    <I>three-dimension*</I>
    <I>3d</I>
    <I>3-d</I>
    <I>3d*</I>
</TermInfo>
I want to group all the tags that belong to each A, i.e. for each A, keep its B, C, D, and I values together. With the handler above, however, the output is flat, like A-B-C-D-I-I-etc. Can I create one object for each A and add the other elements to it? How would I do that?
I think this is along the lines of what you are asking for. It creates a List of HashMap objects. Every time it starts a TermInfo, it creates a new HashMap. Each endElement inside TermInfo puts a value into the Map. When endElement is TermInfo, it sets fieldMap to null so no intermediate tags are added. "TermInfo" represents A from your description.
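A minimal sketch of that approach might look like the following. The names `TermInfoParser` and `TermInfoHandler` are made up for illustration, and the `<Terms>` root element in `main` is only a hypothetical wrapper, since your sample has no root. Two choices here are assumptions of this sketch rather than requirements: each map stores a list per tag, so that repeated `<I>` elements are kept instead of overwriting each other, and text is accumulated in a `StringBuilder`, because SAX is allowed to deliver the text of one element in several `characters` calls.

```java
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.helpers.DefaultHandler;

class TermInfoParser {

    /** Collects one map per <TermInfo>; each map goes from tag name to its values. */
    static class TermInfoHandler extends DefaultHandler {
        final List<Map<String, List<String>>> records = new ArrayList<>();
        private Map<String, List<String>> fieldMap; // null while outside <TermInfo>
        private final StringBuilder text = new StringBuilder();

        @Override
        public void startElement(String uri, String localName, String qName,
                Attributes attributes) {
            if ("TermInfo".equals(qName)) {
                fieldMap = new LinkedHashMap<>(); // start a new record
            }
            text.setLength(0); // discard text seen before this element
        }

        @Override
        public void characters(char[] ch, int start, int length) {
            text.append(ch, start, length); // may be called more than once per element
        }

        @Override
        public void endElement(String uri, String localName, String qName) {
            if ("TermInfo".equals(qName)) {
                records.add(fieldMap); // record is complete
                fieldMap = null;       // ignore tags outside <TermInfo>
            } else if (fieldMap != null) {
                fieldMap.computeIfAbsent(qName, k -> new ArrayList<>())
                        .add(text.toString().trim());
            }
            text.setLength(0);
        }
    }

    /** Parses the stream and returns one map per <TermInfo> element. */
    static List<Map<String, List<String>>> parse(InputStream in) throws Exception {
        SAXParser parser = SAXParserFactory.newInstance().newSAXParser();
        TermInfoHandler handler = new TermInfoHandler();
        parser.parse(in, handler);
        return handler.records;
    }

    public static void main(String[] args) throws Exception {
        // <Terms> is a hypothetical root element wrapping the sample data.
        String xml = "<Terms>"
                + "<TermInfo><A>1/f noise</A><B>Random noise</B><I>1-f</I><I>1/f</I></TermInfo>"
                + "<TermInfo><A>3D printing</A><B>Materials fabrication</B><I>3d</I></TermInfo>"
                + "</Terms>";
        InputStream in = new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8));
        for (Map<String, List<String>> record : parse(in)) {
            System.out.println(record);
        }
    }
}
```

Each entry in the returned list then corresponds to one `TermInfo`, so you can write a record out (for example with your `documentWriter`) once its `endElement` fires, rather than after every child tag. For a very large file you would probably write each record and drop it instead of keeping the whole list in memory.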