Let RNG be a relax NG specification, XML be a valid XML document relative to RNG, designate n be the nth character of the XML document, B a node of XML and XML[n = B] the XML document where the node B is inserted at the character n.
I want to define a function suggest(RNG, XML, n) → {Node} = O such that
- for any
B ∈ O,XML[n = B]is a valid XML document relative toRNGand, - for any
C ∉ O,XML[n = C]is not valid.
For now, I have tried two approaches:
Using a double recursion on
XMLandRNG, at the elementXML[n]containing thenthe character and the RNG<element>RNG[n]matchingXML[n], and return a tuple(boolean, suggestions), then decide what to return next based on those two elements. For instance, if the currentRNGnode is a choice betweenAandB, iffn(A)[0]is true andfn(B)[0]is false, return(true, fn(A)[1]), if they are both false then(false, []), and if they are both true, ..., well(true, fn(A)[1] + fn(B)[1])might not be too bad, but certainly isn't always the right anwser.Using a "state machine", with four piles (say
*rwith the children ofRNG[n],*xwith the children ofXML[n],*cwith the current choices that are being solved, and*swith the output suggestions), such that
- if
*requals*x, then pop*rand*x, if*rdoesn't equal*x, then pop*rand push it in*s, - if
*ris a choice betweenAandB, then push(*r, *x, B, null)in*cand pushAin*r. - if
*ris a ref, pop it and push the matching definition, if it is a definition process it as a group, - if
*ris a group, push its element to*r - if
*c[0]equals*r, and*c[2]is a node, then pop it, set*rand*xto*c[2],*c[0]and*c[1]respectively, push(*r, *x, null, *s)in*c - if
*c[0]equals*rand*c[2]is null, then pop it, compare*sand*c[3], then, decide (and that's quite hard), which one to keep. - simplify the other nodes in terms of choices and groups (not so hard), such that they only have two elements.
But both have not been so successful:
- Both a are horrifingly slow on a large schema
- The first one fails frequently because of too much recursion, especially if the RNG contains multiple
<ref>s, and it gets quite difficult to check for infinite loops too. - The second one uses an awful lot of memory, because I have to deep copy each XML nodes on every
<choice>I encounter. - Writing tests is hard because of the number of possible cases, and I can't find a way to generate them programatically. So actually, I am not even sure my function produces a valid XML document.
- It is still not clear to me how I should be handling
<choice>s. For instance, take<choice><element name="a" /><group><element name="a" /><element name="b" /></group></choice>: how should I encode this so that all valid nodes are suggested?
I know that multiple editors (e.g. http://www.thaiopensource.com/nxml-mode/ for Emacs) offer this option, but I am not sure they do indeed produce valid documents, and I am not familiar enough to read through Lisp.
So I am searching for a "good way" to implement this on DOM, or, even better a JavaScript library that would allow to do that.