Please see the following code:
import xml.etree.ElementTree as ET
for x in ("<a><b /><c><d /></c></a>", "<a><q /><b /><c><d /></c></a>", "<a><m /><q /><b /><c><d /></c></a>"):
    root = ET.fromstring(x)
    for e in root: root.remove(e)
    print(ET.tostring(root))
I expect it to output <a></a> in all instances, but instead it gives:
b'<a><c><d /></c></a>'
b'<a><b /></a>'
b'<a><q /><c><d /></c></a>'
I totally don't grok this. I don't see any pattern to the specific elements that were removed either.
The documentation merely says:
Removes subelement from the element. Unlike the find* methods this method compares elements based on the instance identity, not on tag value or contents.
What am I doing/assuming wrong? I am getting basically the same output with both Python 2.7.5 and 3.4.0 on Kubuntu Trusty.
Thanks!
This demonstrates the problem:
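A minimal sketch of such a demonstration, assuming the first input string from the question. The name `visited` is introduced here for illustration only; it records which subelements the loop actually reaches:

```python
import xml.etree.ElementTree as ET

root = ET.fromstring("<a><b /><c><d /></c></a>")

visited = []  # hypothetical helper list: tags the loop actually sees
for e in root:
    visited.append(e.tag)
    # Removing e shifts the remaining children one position to the left,
    # so the next position the loop looks at skips one sibling.
    root.remove(e)

print(visited)            # only 'b' is visited; 'c' is never reached
print(ET.tostring(root))  # 'c' (and its child 'd') is still there
```

After `b` is removed, `c` moves into the position the loop has already passed, so the loop ends without ever seeing it.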
So, modifying the object that you are iterating over affects the iteration. This is not entirely unexpected; the same thing happens if you alter a list while iterating over it:
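For comparison, a small sketch of the same effect with a plain list (the list contents are made up for illustration):

```python
items = ["b", "c", "d"]

for item in items:
    # Each removal shifts the later items left, so every other item is skipped.
    items.remove(item)

print(items)  # ['c'] -- 'c' was skipped and therefore never removed
```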
As a workaround you can repeatedly remove the first subelement like this:
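One way this might look, as a sketch using the three input strings from the question (the `results` list is added here only so the outcome can be inspected):

```python
import xml.etree.ElementTree as ET

results = []  # hypothetical: collects the serialized roots for inspection
for x in ("<a><b /><c><d /></c></a>",
          "<a><q /><b /><c><d /></c></a>",
          "<a><m /><q /><b /><c><d /></c></a>"):
    root = ET.fromstring(x)
    # len(root) is the number of direct subelements; keep removing the
    # first one until none are left.
    while len(root):
        root.remove(root[0])
    results.append(ET.tostring(root))
    print(results[-1])
```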
Output (Python 3):
b'<a />'
b'<a />'
b'<a />'
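This works because the loop does not rely on an iterator whose position shifts as subelements are removed; it simply checks again, on every pass, whether any subelements are left. Or, if you want to remove all subelements of the root element and all of its attributes, you can use root.clear().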
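A short sketch of the `clear()` approach; the `id` attribute on the sample input is an assumption added to show that attributes are dropped as well:

```python
import xml.etree.ElementTree as ET

# The id="1" attribute is illustrative and not part of the question's input.
root = ET.fromstring('<a id="1"><b /><c><d /></c></a>')

# clear() removes all subelements and attributes, and also resets
# the element's text and tail to None.
root.clear()

print(ET.tostring(root))  # b'<a />' on Python 3
```

Note that `clear()` is broader than removing children one by one: if the attributes or text of the root need to survive, the loop-based workaround is the safer choice.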