SAX parsing - mapping nested tags into the main tag

I want to use the SAX parser for a large XML file. The handler looks like this:

DefaultHandler handler = new DefaultHandler() {
  String temp;
  HashSet<String> xml_Elements = new LinkedHashSet<String>();
  HashMap<String, Boolean> xml_Tags = new LinkedHashMap<String, Boolean>();
  HashMap<String, ArrayList<String>> tags_Value = new LinkedHashMap<String, ArrayList<String>>();

  // ### startElement #######
  public void startElement(String uri, String localName, String qName,
    Attributes attributes) throws SAXException {
    xml_Elements.add(qName);


    for (String tag: xml_Elements) {
      if (qName.equals(tag)) { // compare string contents, not references
        xml_Tags.put(qName, true);
      }
    }
  }

  // ########### Characters ###########
  public void characters(char ch[], int start, int length) throws SAXException {

    temp = new String(ch, start, length);
  }

  // ########### endElement ############
  public void endElement(String uri, String localName,
    String qName) throws SAXException {

    if (xml_Tags.get(qName) == true) {
      if (tags_Value.containsKey(qName)) {
        tags_Value.get(qName).add(temp);
        tags_Value.put(qName, tags_Value.get(qName));
      }
      else {
        ArrayList < String > tempList = new ArrayList < String > ();
        tempList.add(temp);
        //tags_Value.put(qName, new ArrayList<String>());
        tags_Value.put(qName, tempList);
      }
      //documentWriter.write(qName + ":" + temp + "\t");
      for (String a: tags_Value.keySet()) {
        try {
          documentWriter.write(tags_Value.get(a) + "\t");
        }
        catch (IOException e) {
          // TODO Auto-generated catch block
          e.printStackTrace();
        }
      }
      xml_Tags.put(qName, false);
    }
    tags_Value.clear();
  }
};

My XML is like:

<TermInfo>
    <A>1/f noise</A>
    <B>Random noise</B>
    <C>Accepted</C>
    <D>Flicker noise</D>
    <F>Pink noise</F>
    <I>1-f</I>
    <I>1/f</I>
    <I>1/f noise</I>
    <I>1:f</I>
    <I>flicker noise</I>
    <I>noise</I>
    <I>pink noise</I>
    <ID>1</ID>
</TermInfo>
<TermInfo>
    <A>3D printing</A>
    <B>Materials fabrication</B>
    <C>Accepted</C>
    <D>3d printing</D>
    <F>2</F>
    <I>three dimension*</I>
    <I>three-dimension*</I>
    <I>3d</I>
    <I>3-d</I>
    <I>3d*</I>
</TermInfo>

I want to group all nested tags under tag A, i.e., for each A, keep its B, C, D, and I values together. With the handler above, the output is flat, like A-B-C-D-I-I-etc. Can I make one object for each A and add the other elements to it? How can I do this?

There is 1 answer

ProgrammersBlock On

I think this is along the lines of what you are asking for. It creates a List of HashMap objects. Every time it starts a TermInfo, it creates a new HashMap. Each endElement inside TermInfo puts a value into the Map. When endElement is TermInfo, it sets fieldMap to null so no intermediate tags are added. "TermInfo" represents A from your description.

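import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

import org.xml.sax.Attributes;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

public class TestHandler extends DefaultHandler
{
    // Map for the TermInfo currently being read; null outside a TermInfo
    Map<String, String> fieldMap = null;
    List<Map<String, String>> tags_Value = new ArrayList<Map<String, String>>();
    // characters() may be called more than once per element, so accumulate the text
    StringBuilder temp = new StringBuilder();

    // ###startElement#######
    @Override
    public void startElement(String uri, String localName, String qName,
            Attributes attributes) throws SAXException
    {
        // qName is used because localName is empty unless the parser is namespace-aware
        if (qName.equals("TermInfo")) // A
        {
            fieldMap = new HashMap<String, String>();
            tags_Value.add(fieldMap);
        }
        temp.setLength(0); // start collecting text for the new element
    }

    // ###########characters###########
    @Override
    public void characters(char ch[], int start, int length)
        throws SAXException
    {
        temp.append(ch, start, length);
    }

    // ###########endElement############
    @Override
    public void endElement(String uri, String localName, String qName)
        throws SAXException
    {
        if (fieldMap != null)
        {
            if (!qName.equals("TermInfo")) // A
            {
                // Note: a repeated tag such as <I> overwrites its earlier value here;
                // use a Map<String, List<String>> if every value must be kept
                fieldMap.put(qName, temp.toString().trim());
            }
            else
            {
                // END of TermInfo
                fieldMap = null;
            }
        }
        temp.setLength(0);
    }
}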
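
Since the sample XML repeats the I tag inside each TermInfo, a variant that keeps every value per tag may be closer to what the question asks for. The following is only a minimal sketch of the same idea, not the answer's original code: the class name TermInfoGrouping, the GroupingHandler and parse helper, and the wrapping Terms root element are all assumptions made for illustration (SAX needs a single root element, which the posted fragment does not show).

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical example class; the names here are not from the original post.
public class TermInfoGrouping {

    /** Collects each TermInfo into a map from tag name to all of its values. */
    static class GroupingHandler extends DefaultHandler {
        final List<Map<String, List<String>>> records = new ArrayList<>();
        private Map<String, List<String>> current; // null outside a TermInfo
        private final StringBuilder text = new StringBuilder();

        @Override
        public void startElement(String uri, String localName, String qName,
                Attributes attributes) {
            if (qName.equals("TermInfo")) {
                current = new LinkedHashMap<>();
                records.add(current);
            }
            text.setLength(0); // discard whitespace seen between elements
        }

        @Override
        public void characters(char[] ch, int start, int length) {
            // May be called several times for one element, so append
            text.append(ch, start, length);
        }

        @Override
        public void endElement(String uri, String localName, String qName) {
            if (current == null) {
                return; // element outside any TermInfo (e.g. the assumed root)
            }
            if (qName.equals("TermInfo")) {
                current = null; // record finished
            } else {
                // Repeated tags such as <I> accumulate in one list
                current.computeIfAbsent(qName, k -> new ArrayList<>())
                       .add(text.toString().trim());
            }
            text.setLength(0);
        }
    }

    /** Parses the XML string and returns one map per TermInfo element. */
    static List<Map<String, List<String>>> parse(String xml) throws Exception {
        SAXParser parser = SAXParserFactory.newInstance().newSAXParser();
        GroupingHandler handler = new GroupingHandler();
        parser.parse(new InputSource(new StringReader(xml)), handler);
        return handler.records;
    }

    public static void main(String[] args) throws Exception {
        // "Terms" is an assumed wrapper element, not part of the posted XML
        String xml = "<Terms>"
                + "<TermInfo><A>1/f noise</A><I>1-f</I><I>1/f</I></TermInfo>"
                + "<TermInfo><A>3D printing</A><I>3d</I></TermInfo>"
                + "</Terms>";
        for (Map<String, List<String>> record : parse(xml)) {
            System.out.println(record);
        }
        // prints:
        // {A=[1/f noise], I=[1-f, 1/f]}
        // {A=[3D printing], I=[3d]}
    }
}
```

For a large file, keeping every record in a list may use too much memory; in that case each map could be written out at the point where the closing TermInfo tag is seen instead of being stored.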