Getting valueless elements in python lxml

I've been using the lxml package's "objectify" module to parse XML and have run into a problem: if a tag has no value (no text content), I can't find a way to get its attributes.

For instance:

import lxml.objectify

xml_obj = lxml.objectify.fromstring("""
<A>
    <B foo="baz"/>
    <B foo="bar"/>
</A>""")
print(lxml.objectify.dump(xml_obj))

A = None [ObjectifiedElement]
    B = u'' [StringElement]
      * foo = 'baz'
    B = u'' [StringElement]
      * foo = 'bar'

As you can see, both B tags are turned into StringElement objects. The dump still lists their attributes, though, so there should be a way to retrieve them!

Best answer (by unutbu):
import lxml.objectify as objectify
import lxml.etree as ET

content = """
<A>
    <B foo="baz"/>
    <B foo="bar"/>
</A>"""
xml_obj = objectify.fromstring(content)
print(xml_obj.getchildren())
# [u'', u'']

You can access the element's attributes using elt.attrib:

for child in xml_obj.getchildren():
    print(child.attrib)
# {'foo': 'baz'}
# {'foo': 'bar'}
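
If only a single attribute is needed, the standard element method get() should also work on objectified elements. The sketch below assumes the same sample document; the names foo_values and missing, and the attribute name "nope", are made up for illustration:

```python
import lxml.objectify as objectify

content = """
<A>
    <B foo="baz"/>
    <B foo="bar"/>
</A>"""
xml_obj = objectify.fromstring(content)

# Iterating over xml_obj.B walks all sibling <B> elements.
foo_values = [child.get("foo") for child in xml_obj.B]

# get() takes an optional default for attributes that are not present.
missing = xml_obj.B.get("nope", "default")

print(foo_values)
print(missing)
```

Unlike indexing into attrib, get() does not raise KeyError when the attribute is absent, which can be convenient for optional attributes.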

You can modify those attributes as well:

xml_obj.B.attrib['baz'] = 'boo'
xml_obj.B[1].attrib['foo'] = 'blah'

Serializing xml_obj with ET.tostring shows the result:

print(ET.tostring(xml_obj, pretty_print=True).decode())
# <A>
#   <B foo="baz" baz="boo"/>
#   <B foo="blah"/>
# </A>
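
As a variation on the steps above, the same changes can be made with set(), and tostring can be asked for text instead of bytes. This is only a sketch under the assumption that the input is the same two-element document; the variable name serialized is introduced here for illustration:

```python
import lxml.etree as ET
import lxml.objectify as objectify

xml_obj = objectify.fromstring('<A><B foo="baz"/><B foo="bar"/></A>')

# set() has the same effect as assigning through .attrib.
xml_obj.B.set("baz", "boo")        # first <B>
xml_obj.B[1].set("foo", "blah")    # second <B>

# encoding="unicode" makes tostring return a str rather than bytes.
serialized = ET.tostring(xml_obj, encoding="unicode")
print(serialized)
```

Requesting a str this way avoids the b'...' wrapper that Python 3 shows when a bytes object is printed directly.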