Relaxng schema and XML Schema schema for every XML?
Is there a set of XML documents for which we can formulate an RNG, but not an XML Schema (or the other way round)? Can you give me an example, please?
Asked by ivanacorovic. There are 2 answers.
As the OP's exchange with MK has shown, there is some opportunity for misunderstanding here. So I'll offer a response here even though MK has in fact already answered the question.
I assume the question is whether XSD and RNG have the same expressive power and, if not, whether one is more expressive than the other.
That is to say: every schema defines a set of documents (namely, the set valid against the schema). There are (as MK says) a great many sets of documents for which neither language can define a schema that accepts as valid just the documents in that set. So perhaps the clearest way to put the question is:
Given a schema in schema language L1, is there guaranteed to be a schema in schema language L2 that accepts as valid the same set of input documents?
Or equivalently:
Are there schemas expressible in language L1 which have no equivalent in language L2?
The answer is that each of the schema languages mentioned can express some schemas not expressible with the other.
I'll leave aside trivia like xsi:type and possible peculiarities of the way Relax NG uses XSD simple types and questions of what exactly we mean by the set of documents accepted by a schema, though these points may be important in some contexts.
RelaxNG schemas with no XSD equivalents:
As MK points out, RelaxNG schemas can control the location of non-whitespace character data:
(a, b, text, c, d) is a legitimate content model in Relax NG that has no equivalent in XSD.
So as an example of a Relax NG schema (in compact syntax) with no equivalent in XSD, consider the following:

    start = e
    e = element e { (e, text, e)? }

XSD content models must be deterministic (in the jargon of the XSD spec, they must obey the Unique Particle Attribution constraint) while Relax NG content models need not be so. So the possible sequences of moves for a chess game are expressible in Relax NG, but not in XSD: (white, black)*, white?. (Since every non-deterministic FSA has a deterministic equivalent, it sometimes surprises people that the same is not true for content models: it is not the case that every non-deterministic content model can be rewritten as an equivalent deterministic content model. Anne Brüggemann-Klein identified the set of regular languages for which there are no deterministic content models a couple of decades ago in her Habilitationsschrift.)

    start = element game { (white, black)*, white? }
    white = element white { sq }
    black = element black { sq }
    sq = attribute square { text }

Since Relax NG includes attributes in content models, it is possible in Relax NG to make the effective content model of an element depend on the value of one of its attributes, in ways that are possible in XSD 1.1 but not in XSD 1.0.

    start = element whaddyawant {
      (attribute gimme { 'a' }, a+)
      | (attribute gimme { 'b' }, b+)
    }
    a = element a { empty }
    b = element b { empty }
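To make the co-constraint in the whaddyawant schema concrete, here is a rough Python sketch of the same rule written as an ordinary check: the value of gimme selects which child element is allowed. It is only an approximation of what a Relax NG validator does (the helper names are made up for illustration, and whitespace handling is simplified), not a substitute for one.

```python
import xml.etree.ElementTree as ET


def _only_whitespace(elem):
    """True if the element has no non-whitespace text directly inside it."""
    chunks = [elem.text] + [child.tail for child in elem]
    return all(not (chunk or "").strip() for chunk in chunks)


def matches_whaddyawant(elem):
    """Approximate the whaddyawant pattern: gimme='a' needs a+, gimme='b' needs b+."""
    # The element must carry exactly one attribute, named gimme.
    if elem.tag != "whaddyawant" or set(elem.attrib) != {"gimme"}:
        return False
    wanted = elem.attrib["gimme"].strip()
    if wanted not in ("a", "b"):
        return False
    children = list(elem)
    # One or more children, and no stray character data between them.
    if not children or not _only_whitespace(elem):
        return False
    # Every child must be the element named by gimme, and must be empty.
    return all(
        child.tag == wanted
        and not child.attrib
        and len(child) == 0
        and _only_whitespace(child)
        for child in children
    )
```

For instance, a whaddyawant element with gimme="a" and two a children passes this check, while the same element with a b child does not.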
XSD schemas with no RelaxNG equivalents:
XSD defines types to support the ID/IDREF constraints expressible in XML DTDs; Relax NG relegates them to a separate 'DTD compatibility' specification which is (at least in my experience) awkward and error-prone in practice. In particular, if a schema declares an attribute of type ID, any wildcard element with wildcard attributes is likely to cause trouble.
XSD schemas can define uniqueness, key, and key reference constraints that are not expressible in Relax NG schemas. Example:

    <?xml version="1.0" encoding="UTF-8"?>
    <xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
               elementFormDefault="qualified">
      <xs:element name="digraph" type="DG">
        <xs:key name="node-ids">
          <xs:selector xpath="node"/>
          <xs:field xpath="@nodeID"/>
        </xs:key>
        <xs:keyref refer="node-ids" name="arc-ends">
          <xs:selector xpath="node"/>
          <xs:field xpath="@arc-to"/>
        </xs:keyref>
      </xs:element>
      <xs:element name="node" type="N"/>
      <xs:complexType name="DG">
        <xs:sequence>
          <xs:element ref="node" minOccurs="0" maxOccurs="unbounded"/>
        </xs:sequence>
      </xs:complexType>
      <xs:complexType name="N">
        <xs:attribute name="nodeID" type="xs:integer"/>
        <xs:attribute name="arc-to" type="xs:integer" use="optional"/>
      </xs:complexType>
    </xs:schema>

XSD 1.1 assertions have no analog in Relax NG, and it is possible to express constraints with them that cannot be expressed in Relax NG. E.g. 'in each district element, the value in total/@n must be equal to the sum of the n attributes on the other children' (number(total/@n) eq sum((* except total)/@n)). Example left as an exercise for the reader.
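For readers who want to see what that assertion checks without setting up an XSD 1.1 processor, the following Python sketch evaluates roughly the same condition by hand. It is not an XSD assertion, only an illustration of the arithmetic; the function name is hypothetical, and it assumes each district has exactly one total child and that the n attributes hold plain numbers.

```python
import xml.etree.ElementTree as ET


def district_total_matches(district):
    """Mimic number(total/@n) eq sum((* except total)/@n) for one district."""
    total = district.find("total")
    if total is None or total.get("n") is None:
        # number() of a missing value is NaN, which equals nothing.
        return False
    # Sum the n attributes of every child other than total;
    # children without an n attribute contribute nothing.
    others = sum(
        float(child.get("n"))
        for child in district
        if child.tag != "total" and child.get("n") is not None
    )
    return float(total.get("n")) == others
```

The child element names are irrelevant to the check: any child other than total that carries an n attribute is counted.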
Note that MK is not quite right to say "XSD can define more precise rules on the cardinality range permitted for child elements". I do not believe there is any cardinality constraint expressible in XSD that cannot be expressed in Relax NG. It is true that it would be rather tedious to express in Relax NG a constraint that says there must be at least one line on an invoice, but not more than 999 lines. But it would certainly be possible. A content model that says there must be at least one but not more than nine a elements is (a, a?, a?, a?, a?, a?, a?, a?, a?). It's easy to see how to extend that to handle the case of 999 lines on an invoice.
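Since nobody wants to type 999 particles by hand, here is a small Python sketch that writes out such a content model mechanically. The function name is invented for illustration; it simply emits the required occurrences followed by the optional ones, in Relax NG compact syntax.

```python
def bounded_repeat(name, low, high):
    """Build a Relax NG compact-syntax group allowing low..high occurrences of name."""
    if low < 0 or high < 1 or high < low:
        raise ValueError("need 0 <= low <= high and high >= 1")
    # Required occurrences first, then one optional particle per extra occurrence.
    parts = [name] * low + [name + "?"] * (high - low)
    return "(" + ", ".join(parts) + ")"


print(bounded_repeat("a", 1, 9))
# prints "(a, a?, a?, a?, a?, a?, a?, a?, a?)"
```

Calling bounded_repeat("line", 1, 999) produces the invoice case: one required line followed by 998 optional ones.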
When you talk of "a set of XML documents for which we can formulate a schema", do you mean that the schema must accept every document within the set, and reject every document outwith the set? In general if you start with an arbitrary set of documents it's very unlikely that you will be able to formulate such a schema, regardless of your choice of schema language. And it's certainly true that the sets of documents that do have that property will be different from one schema language to another.
Moreover, if your set of documents is finite, then it's not really very useful to define such a schema, because it will be impossible to write any new documents that conform to the schema. While if the set of documents is infinite, then the only real way to define your set of documents is by writing the schema that they conform to, which makes the whole thing pointless.
There are some constraints that are expressible in RNG but not in XSD, and there are also some constraints that are expressible in XSD and not in RNG.
For example, RelaxNG can define more precise rules on the content of text nodes in mixed content, while XSD can define more precise rules on the cardinality range permitted for child elements.
The detailed comparison depends on which version of XSD you are talking about.