Java StringBuilder remove XML Tags


I'm trying to pass an XML object through a StringBuilder so that I can compare two objects in a way that fits my needs.

I was wondering if there is a nice way to remove specific tags and attributes from the whole String. I prepared an example below:

Original:

<ApprovalSet>
   <ApprovalItem application="Annotext" id="a089989361v451cag47e9f5e9a35716" name="ApprovalItemName" nativeIdentity="xxx12345" operation="Add" state="Finished" value="G1">
       <ApprovalItemComments>
          <Comment author="Random Guy" comment="THE NAME" date="1657122647591"/>
      </ApprovalItemComments>
   </ApprovalItem>
</ApprovalSet>

Desired Outcome:

<ApprovalSet>
   <ApprovalItem application="Annotext" name="ApprovalItemName" nativeIdentity="xxx12345" operation="Add" value="G1"/>
</ApprovalSet>

So basically, I want to remove the id and state attributes and the whole ApprovalItemComments element, and then close the ApprovalItem tag (or just remove all slashes).

Any ideas? Thank you in advance :)

Jonas

There is 1 answer

Answered by Eritrean

Instead of string manipulation or regex, I would recommend using an XML parser. If all you need is to remove some elements and some attributes, you might want to look into Jsoup, which is actually an HTML parser but can also handle XML and is very intuitive to work with. Using Jsoup, your code could look like this:

import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;
import org.jsoup.parser.Parser;
import org.jsoup.select.Elements;

public class Example {
    public static void main (String[] args) {
        String originalXml = 
                  "<ApprovalSet>\n"
                + "   <ApprovalItem application=\"Annotext\" id=\"a089989361v451cag47e9f5e9a35716\" name=\"ApprovalItemName\" nativeIdentity=\"xxx12345\" operation=\"Add\" state=\"Finished\" value=\"G1\">\n"
                + "       <ApprovalItemComments>\n"
                + "          <Comment author=\"Random Guy\" comment=\"THE NAME\" date=\"1657122647591\"/>\n"
                + "      </ApprovalItemComments>\n"
                + "   </ApprovalItem>\n"
                + "</ApprovalSet>";

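        // Parse with Jsoup's XML parser instead of the default HTML parser
        Document doc = Jsoup.parse(originalXml, "", Parser.xmlParser());

        Element approvalSet  = doc.selectFirst("ApprovalSet");
        Element approvalItem = doc.selectFirst("ApprovalItem");
        Elements comments    = approvalItem.select("ApprovalItemComments");

        // Drop the unwanted attributes, then the whole comments element
        approvalItem.removeAttr("id").removeAttr("state");
        comments.remove();

        // Serialize the cleaned tree back to a String
        String result = approvalSet.toString();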

        System.out.println(result);
    }
}

Output:

<ApprovalSet> 
 <ApprovalItem application="Annotext" name="ApprovalItemName" nativeIdentity="xxx12345" operation="Add" value="G1">  
 </ApprovalItem> 
</ApprovalSet>
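
Note that the output above still has a separate closing tag, most likely because whitespace text is left inside ApprovalItem after the comments element is removed. If the self-closed form from the question matters, or if adding a dependency is not an option, the same idea can be sketched with the JDK's built-in DOM parser. This is only a sketch under assumptions: the class name DomStripExample and the helper strip are made up for illustration, and the attribute order and exact whitespace of the result depend on the JDK's serializer.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

public class DomStripExample {

    // Hypothetical helper: removes "id", "state" and the ApprovalItemComments
    // child from every ApprovalItem, then serializes the document again.
    static String strip(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));

            NodeList items = doc.getElementsByTagName("ApprovalItem");
            for (int i = 0; i < items.getLength(); i++) {
                Element item = (Element) items.item(i);
                item.removeAttribute("id");
                item.removeAttribute("state");

                // Remove the comments element and any whitespace-only text,
                // so that the item ends up with no children at all.
                Node child = item.getFirstChild();
                while (child != null) {
                    Node next = child.getNextSibling();
                    boolean isComments = child.getNodeType() == Node.ELEMENT_NODE
                            && "ApprovalItemComments".equals(child.getNodeName());
                    boolean isBlankText = child.getNodeType() == Node.TEXT_NODE
                            && child.getTextContent().trim().isEmpty();
                    if (isComments || isBlankText) {
                        item.removeChild(child);
                    }
                    child = next;
                }
            }

            Transformer transformer = TransformerFactory.newInstance().newTransformer();
            transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
            StringWriter out = new StringWriter();
            transformer.transform(new DOMSource(doc), new StreamResult(out));
            return out.toString();
        } catch (Exception e) {
            // Parsing and transforming throw checked exceptions; rethrow unchecked
            throw new IllegalStateException("Could not process XML", e);
        }
    }

    public static void main(String[] args) {
        String originalXml =
                  "<ApprovalSet>\n"
                + "   <ApprovalItem application=\"Annotext\" id=\"a089989361v451cag47e9f5e9a35716\" name=\"ApprovalItemName\" nativeIdentity=\"xxx12345\" operation=\"Add\" state=\"Finished\" value=\"G1\">\n"
                + "       <ApprovalItemComments>\n"
                + "          <Comment author=\"Random Guy\" comment=\"THE NAME\" date=\"1657122647591\"/>\n"
                + "      </ApprovalItemComments>\n"
                + "   </ApprovalItem>\n"
                + "</ApprovalSet>";

        System.out.println(strip(originalXml));
    }
}
```

Because ApprovalItem has no child nodes left, the serializer should write it as an empty element instead of an open/close pair. Two strings cleaned this way can then be compared with equals, as long as both went through the same serializer.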

Maven dependency:

<!-- https://mvnrepository.com/artifact/org.jsoup/jsoup -->
<dependency>
    <groupId>org.jsoup</groupId>
    <artifactId>jsoup</artifactId>
    <version>1.13.1</version>
</dependency>