XInclude not working properly on command line Saxon, but does work with oXygen

I have some XML files transcribing manuscripts that draw from a shared description of the book those manuscripts are in. It's formatted in my XML via three include statements that are laid out as follows:

<xi:include href="../Witness_Descriptions/Trinity_R_3_20.xml" xpointer="publication" xmlns:xi="http://www.w3.org/2001/XInclude"/>
<xi:include href="../Witness_Descriptions/Trinity_R_3_20.xml" xpointer="Trinity_R_3_20" xmlns:xi="http://www.w3.org/2001/XInclude"/>
<xi:include href="../Witness_Descriptions/Minor_Poems_Vol_II.xml" xpointer="MacCracken" xmlns:xi="http://www.w3.org/2001/XInclude"/>

When I run the XSLT within oXygen to transform this into HTML (using oXygen's built-in Saxon-PE 10.6 transformer), it works fine. However, I also need to generate comparisons across the whole corpus of documents, for which I use a Saxon-HE 9 transformer with the following command line:

java -Xms128m -Xmx1024m -XX:+UseCompressedOops -cp saxon9he.jar net.sf.saxon.Query -t -xi -q:test.xq -o:1698815174.72894737321461.xml line=l.1 zone=EETS.ME.I marginalia=text collection=/home/matrygg/minorworksoflydgate.net/XML/Mumming_Eltham

The query collects the XML files from a particular directory using the following two lines, with $collection as an externally declared variable:

let $collection:=concat($collection, '?select=*.xml')
let $q:=collection($collection)

If none of the XML files in the Mumming_Eltham directory contain includes, this works fine and does what I expect. If they do, as is now the case with the includes shown above, I get the following error:

Warning: org.xml.sax.SAXParseException; systemId:
  file:/home/matrygg/minorworksoflydgate.net/XML/Mumming_Eltham/Trinity_College_R_3_20_Mumming_Eltham.xml; lineNumber: 22; columnNumber: 142; Include operation failed, reverting to fallback. Resource error reading file as XML (href='../Witness_Descriptions/Trinity_R_3_20.xml'). Reason: XPointer resolution unsuccessful.
Error on line 22 column 142 of Trinity_College_R_3_20_Mumming_Eltham.xml:
  SXXP0003: Error reported by XML parser: An include with href
  '../Witness_Descriptions/Trinity_R_3_20.xml'failed, and no fallback element was found.
Query failed with dynamic error: org.xml.sax.SAXParseException; systemId: file:/home/USERNAME/minorworksoflydgate.net/XML/Mumming_Eltham/Trinity_College_R_3_20_Mumming_Eltham.xml; lineNumber: 22; columnNumber: 142; An include with href '../Witness_Descriptions/Trinity_R_3_20.xml'failed, and no fallback element was found.

It's obvious to me that the issue is that it's not handling relative paths, but I can't find any examples of how to resolve this within the command line interface or the XML, only via fn:resolve-uri in XSLT. Is there a way to do this at the command line, or do I have to put in absolute URIs for everything I might want to include? I've checked that the files exist in the expected location, and their permissions should allow them to be parsed.

There are 2 answers

Michael Kay

If the base URI of the document containing the xi:include element is correct, then this should work. You haven't shown how this document is read. Presumably it's something to do with the collection parameter to your query, but beyond that, we're guessing.
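To make the base URI point concrete, here is a minimal Java sketch (not part of the original posts) of how a relative href is resolved against the URI of the document that contains it. The class and method names are hypothetical; the file paths are copied from the error message in the question.

```java
import java.net.URI;

class BaseUriResolutionDemo {

    // Resolves a relative href against the base URI of the including document,
    // following the standard URI resolution rules implemented by java.net.URI.
    static String resolveAgainst(String base, String href) {
        return URI.create(base).resolve(href).toString();
    }

    public static void main(String[] args) {
        // Base URI as reported in the systemId of the error message.
        String base = "file:/home/USERNAME/minorworksoflydgate.net/XML/"
                + "Mumming_Eltham/Trinity_College_R_3_20_Mumming_Eltham.xml";
        // href as written in the xi:include element.
        String href = "../Witness_Descriptions/Trinity_R_3_20.xml";
        System.out.println(resolveAgainst(base, href));
    }
}
```

With the base URI shown in the error message, the href resolves to the Witness_Descriptions directory next to Mumming_Eltham, which is where the question says the file lives. If the base URI were missing or wrong, the same href would resolve somewhere else.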

The other factor that can be relevant here is that it's not actually Saxon that is doing the XInclude processing; it's the XML parser which Saxon invokes. It looks as if XInclude processing has been successfully enabled on the XML parser, so that's half of the battle. The other half is getting the base URI right.
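The division of labour described above can be sketched outside Saxon with the JDK's own JAXP parser. This is only an illustration under stated assumptions: it uses whichever parser the JDK ships, a throwaway directory layout with hypothetical file names (shared.xml, main.xml), and a whole-document include with no xpointer attribute, so it does not reproduce the XPointer failure from the question. It only shows that the parser performs the include and that the relative href is resolved against the base URI supplied when the file is parsed.

```java
import java.io.File;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;

class XIncludeBaseUriDemo {

    // Parses a file with XInclude enabled on the parser itself.
    // Passing a File gives the parser a system ID, which becomes the base URI.
    static Document parseWithXInclude(File file) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setNamespaceAware(true); // XInclude processing needs namespaces
        factory.setXIncludeAware(true);  // ask the parser to expand xi:include
        DocumentBuilder builder = factory.newDocumentBuilder();
        return builder.parse(file);
    }

    // Builds a temporary two-directory layout (hypothetical names) and returns
    // the text of the element pulled in through the relative include.
    static String demo() {
        try {
            Path root = Files.createTempDirectory("xinclude-demo");
            Path shared = Files.createDirectory(root.resolve("Witness_Descriptions"));
            Path texts = Files.createDirectory(root.resolve("Texts"));
            Files.write(shared.resolve("shared.xml"),
                    "<publication>shared description</publication>"
                            .getBytes(StandardCharsets.UTF_8));
            Files.write(texts.resolve("main.xml"),
                    ("<doc><xi:include href=\"../Witness_Descriptions/shared.xml\""
                            + " xmlns:xi=\"http://www.w3.org/2001/XInclude\"/></doc>")
                            .getBytes(StandardCharsets.UTF_8));
            Document doc = parseWithXInclude(texts.resolve("main.xml").toFile());
            return doc.getElementsByTagName("publication").item(0).getTextContent();
        } catch (Exception e) {
            throw new IllegalStateException("XInclude demo failed", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

If the same document were parsed from a bare stream with no system ID, the parser would have no base URI to resolve the relative href against, which is the "other half of the battle" mentioned above.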

medievalmatt

After hunting around online some more, I found this thread here that deals with a similar issue, but with XSLT:

How to resolve XInclude instructions in a XML file from command line with XSLT 3.0

I went ahead and updated the version of Saxon I'm using, added the patch oXygen pointed Martine Hinze to (https://mvnrepository.com/artifact/com.oxygenxml/oxygen-patched-xerces/25.1.0.2), and ran the following command:

java -Xms128m -Xmx1024m -XX:+UseCompressedOops -cp 'oxygen-patched-xerces-25.1.0.2.jar:saxon-he-12.3.jar' net.sf.saxon.Query -t -xi -q:test.xq -o:1698982391.97692339769168.xml line=l.1 zone=EETS.ME.H.1 marginalia=header collection=file:/home/USERNAME/minorworksoflydgate.net/XML/Mumming_Eltham

This resolved things without a hitch. So should someone come along with similar problems, the solution is to get the patched Xerces parser that oXygen uses.