Python lxml: using xi:include with multiple XML fragments

I am developing a simple XML logfile class using lxml in Python.

My approach so far has been to use two files: a well-formed XML file that includes a second file containing an XML fragment, pulled in via an xi:include element. This way the XML fragment can be updated efficiently by simply appending <event> elements to the end of the file.

The well-formed XML file ('logfile.xml') looks like this:

<?xml version="1.0"?>
<logfile>
<event xmlns:xi="http://www.w3.org/2001/XInclude">
    <xi:include href="events.xml"/>
</event>
</logfile>

The XML fragment ('events.xml') looks like this:

<event>
     <data></data>
</event>
<event>
     <data></data>
</event>
<event>
     <data></data>
</event>

My goal is to end up with:

<?xml version="1.0"?>
<logfile>
    <event>
        <data></data>
    </event>
    <event>
        <data></data>
    </event>
    <event>
        <data></data>
    </event>
</logfile>

In Python I am using the xinclude() method to process the xi:include element in my well-formed XML file ('logfile.xml'). This works, but only if there is a single <event> element in the XML fragment ('events.xml').

My Python code:

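from lxml import etree

tree = etree.parse('logfile.xml')
# Resolve xi:include elements in place; this is the call that raises
# XIncludeError when events.xml holds more than one top-level element.
tree.xinclude()
root = tree.getroot()
print(etree.tostring(root, pretty_print=True, xml_declaration=True, encoding='UTF-8').decode('utf-8'))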

The error I am seeing:

lxml.etree.XIncludeError: Extra content at the end of the document

I could wrap the events in another container element, but that doesn't lend itself well to appending data to the end of the XML fragment file.
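For reference, the append step this layout is meant to keep cheap could look roughly like the sketch below. The function name append_event and the data_text parameter are made up for illustration; the sketch only assumes that each event is a single <event> element with one <data> child, as in the fragment above.

```python
from lxml import etree


def append_event(fragment_path, data_text):
    """Append one serialized <event> element to the end of the fragment file."""
    event = etree.Element("event")
    data = etree.SubElement(event, "data")
    data.text = data_text
    # Open in binary append mode: existing events are never re-read or rewritten.
    with open(fragment_path, "ab") as f:
        # etree.tostring returns bytes when no text encoding is requested.
        f.write(etree.tostring(event, pretty_print=True))
```

Because the file is only ever opened for appending, the cost of logging an event does not grow with the size of the log, which is the property a closing container tag would get in the way of.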

There is 1 answer

Francis Avila

The document referenced by xi:include must be a complete, well-formed XML document (a "complete XML infoset"). Your events.xml is not well-formed because it has no single root element containing the <event> elements; that is what the "Extra content at the end of the document" error is reporting.

You might be able to include just a subset by using the xpointer attribute to select the event elements. I'm not sure lxml supports this attribute, however.
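If the fragment file has to stay root-less so that it can be appended to, one possible workaround (not XInclude itself, but a substitute for it) is to read the fragment, wrap it in a temporary root in memory, and move the parsed children into the <logfile> element. The sketch below assumes the fragment contains no XML declaration of its own; load_events, build_logfile, and the wrapper name "events" are hypothetical names chosen for illustration.

```python
from lxml import etree


def load_events(fragment_path):
    """Parse a multi-root XML fragment by wrapping it in a temporary root."""
    with open(fragment_path, "rb") as f:
        fragment = f.read()
    # The wrapper name "events" is arbitrary; it exists only so the parser
    # sees a single root element and accepts the input as well-formed.
    wrapper = etree.fromstring(b"<events>" + fragment + b"</events>")
    return list(wrapper)


def build_logfile(fragment_path):
    """Return a <logfile> element holding every top-level node of the fragment."""
    root = etree.Element("logfile")
    # extend() moves the parsed children out of the temporary wrapper.
    root.extend(load_events(fragment_path))
    return root
```

Calling etree.tostring(build_logfile('events.xml'), pretty_print=True, xml_declaration=True, encoding='UTF-8') should then serialize a document shaped like the one the question is aiming for, without events.xml ever needing a closing container tag on disk.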