I'm using xml.etree.ElementTree to parse an XML file.
How can I force it to either strip only regular whitespace from the text (plain spaces, not escaped ones such as &#x20;), or keep the spaces and ignore the escapes (leave them as they are)?
Here is my problem:
import xml.etree.ElementTree

xml_text = """
<root>
<mytag>
data_with_space&#x20;
</mytag>
</root>"""
root = xml.etree.ElementTree.fromstring(xml_text)
mytag = root.find("mytag")
print("original text: " + repr(mytag.text))
print("stripped text: " + repr(mytag.text.strip()))
It prints:
original text: '\n data_with_space \n '
stripped text: 'data_with_space'
What I need:
'data_with_space '
or (which I can escape by other means):
'data_with_space&#x20;'
A solution using xml.etree.ElementTree is preferable, because otherwise I'd have to rewrite a whole lot of code.
The standard XML library treats &#x20; and ' ' as equal. There is no way to avoid this equalization if you apply fromstring(xml_text) directly, so the two can no longer be told apart once the text has been parsed. The only way to stop the escape from being resolved is to translate it into something else before calling fromstring(), and to translate it back afterwards. You would get:
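As a rough sketch of that translate-and-restore idea (not necessarily the exact code of the original answer): the placeholder token and the two helper functions below are hypothetical names chosen for illustration, and the sketch assumes the escape is written literally as &#x20; and that the placeholder never occurs in the real data.

```python
import xml.etree.ElementTree as ET

# Assumption: the escape appears literally as "&#x20;" in the source text.
ESCAPE = "&#x20;"
# Hypothetical placeholder; pick any string that cannot occur in your data.
PLACEHOLDER = "__ESCAPED_SPACE__"


def parse_keeping_escaped_spaces(xml_text):
    """Hide the escape from the parser by swapping in the placeholder."""
    return ET.fromstring(xml_text.replace(ESCAPE, PLACEHOLDER))


def clean_text(text, keep_escape=False):
    """Strip ordinary whitespace, then restore the protected spaces."""
    stripped = (text or "").strip()
    return stripped.replace(PLACEHOLDER, ESCAPE if keep_escape else " ")


xml_text = """
<root>
<mytag>
data_with_space&#x20;
</mytag>
</root>"""

root = parse_keeping_escaped_spaces(xml_text)
mytag = root.find("mytag")
print(repr(clean_text(mytag.text)))                    # 'data_with_space '
print(repr(clean_text(mytag.text, keep_escape=True)))  # 'data_with_space&#x20;'
```

Note that the replacement is done on the raw text, so it also affects attribute values, and other spellings of the same character (for example &#32;) would need their own replacement.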