I am trying to figure out how to handle a problem I ran into while using the EBNF extension in Jison (by default Jison supports only BNF; EBNF support can be enabled as needed).
I am trying to write a simple XML parser. In XML, there are empty tags and non-empty tags. Non-empty tags have a start tag, content, and then an end tag. The EBNF rule for the content is defined as follows:
    Content
        : CHARDATA? (Element CHARDATA?)* {
            var children = [];
            $1 && children.push($1);
            /* This will contain an array of all elements
               but no character data ?! */
            $2 && children.push($2);
            $$ = children;
        }
        ;
By debugging, I found that Jison assigns the capture group to $2 and passes its matches as an array. That makes sense, since I expect a list of matches here. What really puzzles me is why that array contains only the elements and no character data.
Assume this input string, for example:
<a>h<x/>i<y/>j</a>
For this input, the rule above yields a representation for h, x, and y, but i and j are missing.
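To make the discrepancy concrete, here is a small sketch in plain JavaScript that mirrors the semantic action of the rule outside of Jison. The `buildChildren` helper and the `{ tag: ... }` node objects are simplified placeholders I made up for illustration (my real grammar builds different objects), and the "expected" shape is only my assumption of what a repeated group could reasonably deliver:

```javascript
// Mirrors the semantic action of the Content rule:
// `first` plays the role of $1, `group` the role of $2.
function buildChildren(first, group) {
  var children = [];
  first && children.push(first);
  group && children.push(group);
  return children;
}

// Hypothetical placeholder nodes standing in for parsed elements.
var x = { tag: 'x' };
var y = { tag: 'y' };

// What I observe for <a>h<x/>i<y/>j</a>: $2 holds only the elements.
var observed = buildChildren('h', [x, y]);

// What I expected (assumption): each repetition of the group
// also carries its trailing character data.
var expected = buildChildren('h', [[x, 'i'], [y, 'j']]);

console.log(JSON.stringify(observed));
console.log(JSON.stringify(expected));
```

In the observed value there is simply no slot where i and j could have ended up, which is why I suspect the group itself only hands back its first symbol.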
I assume I am missing something, but I don't know what it could be.
I can provide the full grammar if needed; I tried to isolate the problem here.
Many thanks in advance!
Best regards, Harald