expr = Ref('expr')
block = '{' + Repeat(expr) + '}'
expr = block | Token(re='[0-9]')
START = expr
Here's a grammar using the lrparsing
module for Python. The module reports no conflicts in the grammar.
It fails to parse the string {{0}}
with the error
lrparsing.ParseError: line 1 column 5: Got '}' when expecting __end_of_input__ while trying to match block in state 11
The stack state step by step is:
shift '{'; ItemSet:5='{'
shift '{'; ItemSet:5='{' ItemSet:5='{'
shift /[0-9]/; ItemSet:4=/[0-9]/ ItemSet:5='{' ItemSet:5='{'
reduce '}'; ItemSet:4=/[0-9]/ -- ItemSet:7=expr ItemSet:5='{' ItemSet:5='{'
reduce '}'; ItemSet:7=expr -- ItemSet:9=expr ItemSet:5='{' ItemSet:5='{'
shift '}'; ItemSet:11='}' ItemSet:9=expr ItemSet:5='{' ItemSet:5='{'
Which afaik means it's shifting {{0
, then upon seeing }
reducing 0
to expr
, then reducing on }
again without having shifted it first, which confuses the bajeezus out of me.
Is this a bug, or am I doing something infinitely simple and stupid?
If it's my grammar, how would I refactor it to satisfy my eternally steamy and passionate desires? If it's a bug, could someone direct me to a python module that has syntax most similar to lrparsing that does work?
EDIT: Refactoring as:
blocks = Ref('blocks')
block = Ref('block')
expr = Ref('expr')
blocks = blocks + block | THIS*0 # THIS*0 is the idiomatic way of getting the empty string
block = '{' + expr + '}'
expr = blocks | Token(re='[0-9]')
START = expr
Allows for proper parsing. The question for me now, is... why? I feel like lrparsing
would've complained to me earlier about any parsing problems... Does Repeat
simply not work the way I expect it to?
lrparsing has a bug; it does not consider the recursive repeats correctly.
Your actual problem can be solved by simple recursion, as you did in your extended edit, though with a bit less clutter.
Note also, that your original grammar would have allowed input such as
{{0{1}}}
. (The reason being that the repeatable part can be expanded to either simple numbers orexpr
again.) Considering your second grammar, you probably didn't want that anyways.I did get some work done with pyparsing, but the syntax is considerably different. Analogous example:
Output:
Hope that helps.