I'm trying to write a schema for some XML documents using RELAX NG, and when I validate with jing, I get an error message I don't understand:
C:\tmp\xml>java -jar jing.jar -c list-test2.rnc list-test.xml
C:\tmp\xml\list-test2.rnc:6:10: error: repeat of "string" or "data" element
Can anyone explain why and help me with a workaround?
Here is a sample document (contrived for simplicity):
list-test.xml:
<?xml version="1.0" encoding="UTF-8"?>
<list-test>
  <list name="list1">
    foo.bar.baz
    quux
    be.bop.a.loo.bop
    <hole name="somename" />
    tutti.frutti
    abc678.foobar
  </list>
  <list name="list2">
    test1
    test2
    test3
    <hole name="hole1" />
    <hole name="hole2" />
    test4
    <hole name="hole3" />
  </list>
</list-test>
Here is a schema that works OK:
list-test.rnc:
grammar {
  start = element list-test { list-test-content }
  list-test-content =
    (element list { list-content })*
  list-content =
    attribute name { text },
    (text | hole-element)*
  hole-element =
    element hole { hole-content }
  hole-content =
    attribute name { text }
}
but when I try to replace the generic text nodes with specific text patterns, I get the error.
list-test2.rnc:
grammar {
  start = element list-test { list-test-content }
  list-test-content =
    (element list { list-content })*
  list-content =
    attribute name { identifier },
    (qualified-identifier | hole-element)*
  hole-element =
    element hole { hole-content }
  hole-content =
    attribute name { identifier }
  identifier =
    xsd:token { pattern="[A-Za-z_][A-Za-z_0-9]*" }
  qualified-identifier =
    xsd:token { pattern="[A-Za-z_][A-Za-z_0-9]*(\.[A-Za-z_][A-Za-z_0-9]*)*" }
}
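You've bumped up against one of RELAX NG's basic limitations: an element's content can be complex (with text patterns, element patterns, sequence patterns, interleave patterns, and quantifier patterns) or simple (with data patterns, value patterns, and list patterns), but not both at the same time. (Of course, it's possible to have a choice between complex and simple content.) In list-test2.rnc, (qualified-identifier | hole-element)* mixes a data pattern with an element pattern and then repeats it, which is what jing means by 'repeat of "string" or "data" element'.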
You really can't do better than to use text here, and maybe write a Schematron rule or two.
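If Schematron isn't convenient, the same check could be done in a small script that runs after jing has validated the structure against list-test.rnc. Here is a minimal Python sketch, not a definitive implementation. It assumes the identifiers inside each list element are separated by whitespace (as in your sample), and the function name bad_tokens is just something I made up for illustration. The regular expression is the one from your qualified-identifier definition.

```python
import re
import xml.etree.ElementTree as ET

# Same pattern as qualified-identifier in list-test2.rnc.
QUALIFIED_IDENTIFIER = re.compile(
    r"[A-Za-z_][A-Za-z_0-9]*(\.[A-Za-z_][A-Za-z_0-9]*)*"
)


def bad_tokens(xml_text):
    """Return (list name, token) pairs for text tokens in <list> elements
    that are not qualified identifiers.

    Hypothetical helper: assumes tokens are whitespace-separated.
    """
    root = ET.fromstring(xml_text)
    problems = []
    for lst in root.iter("list"):
        # Mixed content in ElementTree: the text before the first child is
        # in .text, and the text after each child is in that child's .tail.
        chunks = [lst.text or ""] + [child.tail or "" for child in lst]
        for chunk in chunks:
            for token in chunk.split():
                if not QUALIFIED_IDENTIFIER.fullmatch(token):
                    problems.append((lst.get("name"), token))
    return problems
```

An empty result means every text token in every list element matched the pattern. This only covers the text content; the structure and the attributes still need the RELAX NG schema.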