Nearley parser - how to return indefinite string of matches without ambiguity? (Only four lines)


I am writing software meant to make it easy to publish your choose-your-own-adventure story. I want to switch my parser from JavaScript I wrote myself to the Nearley system.

I have a four-line Nearley parser:

main->(excludebrackets link:+ excludebrackets):+
link->"[LINK:"i excludebrackets "|" excludebrackets "]"
{% (d) => {return'<a href ="func__' + d[3][0].join("") + '()">'+d[1][0].join("")+"</a>"}%} 
excludebrackets->[^\\[\]]:+ | null

The only problem is the very top line. The "link" nonterminal does an excellent job of turning things like:

[LINK: shoot | shoot_dragon] into <a href ="func__ shoot_dragon()"> shoot </a>. But if I try to use more complex input:

You could [LINK: shoot | shoot_dragon] the dragon with your arrows or [LINK: draw | stab_dragon] your sword, but you'd have to let it get close.

my grammar is ambiguous, and thus returns many results. (The results seem easy to work with because of the way JavaScript handles nulls, but this is still, at best, slower than it needs to be.)

The more general question is: how can I return an indefinite series of two matches, without ambiguity?

(As a bonus, can anyone explain what :*, :+ and :? mean exactly? I don't get the question mark.)

There is 1 answer

Best answer, by rici

If by "how can I return an indefinite series of two matches, without ambiguity?" you mean "what's the unambiguous equivalent of a a when a can match the empty input?", the answer is that since a a is necessarily ambiguous, the only solution is to remove one of them (the result still matches the same syntax).
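To see why a a is ambiguous, it can help to count the ways a piece of text splits into two adjacent runs. The snippet below is only an illustration in plain JavaScript, not Nearley code: countSplits is a hypothetical helper, and the regular expression run is an assumed stand-in for excludebrackets (any run of characters other than backslash, "[" and "]", including the empty run).

```javascript
// Hypothetical helper: counts the ways `text` can be split into two
// adjacent pieces that each match `piece` in full.
function countSplits(text, piece) {
  let count = 0;
  for (let i = 0; i <= text.length; i++) {
    // Try every split point, including the two that leave one side empty.
    if (piece.test(text.slice(0, i)) && piece.test(text.slice(i))) {
      count++;
    }
  }
  return count;
}

// Assumed stand-in for excludebrackets: zero or more characters that are
// not a backslash, "[" or "]".
const run = /^[^\\[\]]*$/;

console.log(countSplits("abc", run)); // 4
```

Every split point of a three-character run is a valid reading, so two adjacent excludebrackets give four parses where a single one gives exactly one.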

Since excludebrackets is already an arbitrary repetition, there's no need to worry about matching zero or more than one of them. That's automatic. So you can use the much simpler

main -> (excludebrackets link):* excludebrackets

Note that you could have defined excludebrackets without the explicit | null by using the :* repetition operator ("zero or more repetitions") instead of combining :+ ("one or more repetitions") with a null alternative. For any X, X+ | null is the same as X*, by definition.
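Putting both suggestions together, the whole grammar might look like the sketch below. The postprocessors are my own assumptions rather than part of the original grammar: excludebrackets is given one that joins its characters into a string, so link can use d[1] and d[3] directly, and main concatenates the pieces in order. The comments also cover the three repetition operators from the bonus question.

```
# X:*  zero or more repetitions of X (the value is an array, possibly empty)
# X:+  one or more repetitions of X (the value is a non-empty array)
# X:?  zero or one X, i.e. X is optional (the value is null when X is absent)

# Any amount of text, each chunk followed by a link, then trailing text.
main -> (excludebrackets link):* excludebrackets
{% (d) => d[0].map((pair) => pair.join("")).join("") + d[1] %}

# Same link syntax as before; d[1] is the label and d[3] the target.
link -> "[LINK:"i excludebrackets "|" excludebrackets "]"
{% (d) => '<a href ="func__' + d[3] + '()">' + d[1] + "</a>" %}

# :* already allows the empty match, so no "| null" alternative is needed.
excludebrackets -> [^\\[\]]:*
{% (d) => d[0].join("") %}
```

Because every link is delimited by its own brackets and there is only one excludebrackets between neighbouring links, each stretch of plain text can be read in only one way, so the parser should return a single result.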