Find and delete all occurrences of a string that starts with x


I'm parsing an XML file in order to compare it to another XML file. XML Diff works nicely, but we have found a lot of junk tags that exist in one file and not in the other; they have no bearing on our results, but they clutter up the report. I have already loaded the XML file into memory to do some other things to it, and I'm wondering whether there is an easy way, at the same time, to go through the file and remove all tags that start with, for example, color=. The value of color is all over the map, so it is not easy to find them all and remove them.

There doesn't seem to be any way in XML Diff to specify "ignore these tags".

I could roll through the file, find each instance, find the end of it, delete it out, but I'm hoping there will be something simpler. If not, oh well.

Edit: Here's a piece of the XML:

<numericValue color="-103" hidden="no" image="stuff.jpg" key="More stuff." needsQuestionFormatting="false" system="yes" systemEquivKey="Stuff." systemImage="yes">
    <numDef increment="1" maximum="180" minimum="30">
        <unit deprecated="no" key="BPM" system="yes" />
    </numDef>
</numericValue>

1 answer

dbc (best answer)

If you are using LINQ to XML, you can load your XML into an XDocument via:

        var doc = XDocument.Parse(xml); // Load the XML from a string

Or

        var doc = XDocument.Load(fileName); // Load the XML from a file.

Then search for all elements with matching names and use System.Xml.Linq.Extensions.Remove() to remove them all at once:

        // Requires: using System; using System.Linq; using System.Xml.Linq;
        string prefix = "L"; // Or whatever.

        // Use doc.Root.Descendants() instead of doc.Descendants() to avoid accidentally removing the root element.
        var elements = doc.Root.Descendants().Where(e => e.Name.LocalName.StartsWith(prefix, StringComparison.Ordinal));
        elements.Remove();

Update

In your XML, the color="-103" substring is an attribute of an element, rather than an element itself. To remove all such attributes, use the following method:

    public static void RemoveNamedAttributes(XElement root, string attributeLocalName)
    {
        if (root == null)
            throw new ArgumentNullException(nameof(root));
        // Removing attributes does not change the element tree, so it is safe to do while enumerating elements.
        foreach (var node in root.DescendantsAndSelf())
            node.Attributes().Where(a => a.Name.LocalName == attributeLocalName).Remove();
    }

Then call it like:

        var doc = XDocument.Parse(xml); // Load the XML

        RemoveNamedAttributes(doc.Root, "color");
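
The method above matches the attribute name exactly, which is what the sample XML needs. If you really do want to drop every attribute whose name merely starts with some text, a prefix-based variant might look like the sketch below. The class name AttributeStripper and the method name RemoveAttributesByPrefix are made up for illustration; the sketch also assumes you want to keep namespace declarations (xmlns attributes), so it skips them.

```csharp
using System;
using System.Linq;
using System.Xml.Linq;

public static class AttributeStripper
{
    // Hypothetical helper: removes every attribute whose local name starts
    // with the given prefix and returns how many attributes were removed.
    public static int RemoveAttributesByPrefix(XElement root, string prefix)
    {
        if (root == null)
            throw new ArgumentNullException(nameof(root));
        if (prefix == null)
            throw new ArgumentNullException(nameof(prefix));

        // Collect the matches into a list first, so that removing them
        // cannot interfere with the query that found them.
        var matches = root.DescendantsAndSelf()
            .Attributes()
            .Where(a => !a.IsNamespaceDeclaration
                        && a.Name.LocalName.StartsWith(prefix, StringComparison.Ordinal))
            .ToList();

        foreach (var attribute in matches)
            attribute.Remove();

        return matches.Count;
    }
}
```

Calling AttributeStripper.RemoveAttributesByPrefix(doc.Root, "color") would then remove color as well as any other attribute whose name begins with "color", so check that the prefix is not broader than you intend.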