parsimonious - Rule 'rules' matched in its entirety, but it didn't consume all the text

I am making a simple parser for expressions and this is my code:

import parsimonious as parmon

parser = parmon.Grammar(r"""
            E = E "+" E / id
            id = "0"/"1"/"2"/"3"/"4"/"5"/"6"/"7"/"8"/"9"
    """)

code = "2+2"

print(parser.parse(code))

I get this error:

IncompleteParseError(text, node.end, self)
parsimonious.exceptions.IncompleteParseError: Rule 'rules' matched in its entirety, but it didn't consume all the text. The non-matching portion of the text begins with '/ id
            id = "0"/"1"' (line 2, column 16).

I have also tried Lark-parser but couldn't get it to work either. Any help is appreciated.

There are 2 answers

Bill Bell

I can't offer anything regarding the parsers you mentioned. Have you considered pyparsing? Here's a simple parser for that kind of expression. Some notes on the code below:

  • id is defined to be a one-digit numerical token.
  • Forward indicates that E will be defined later in the code. (It's analogous to a forward declaration in procedural languages.)
  • The << operator assigns a definition to the placeholder E, and that definition may refer to E itself. The | operator builds a 'match first' expression, meaning that the first alternative in the 'or' is applied if possible. The parentheses are needed because << binds more tightly than | in Python.
  • The parser is exercised within the two print calls.

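from pyparsing import Forward, Word, nums

# A one-digit numeric token. (The name shadows Python's built-in id(); it is kept to match the grammar.)
id = Word(nums, min=1, max=1)
# E is declared first so that its definition can refer to E itself.
E = Forward()
# Right-recursive rule: try id '+' E first, otherwise fall back to a lone id.
E << (id + '+' + E | id)

code = '2 + 2'

print(E.parseString(code))

print(E.parseString('3+4+5'))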

This code yields the following result.

['2', '+', '2']
['3', '+', '4', '+', '5']
sophros

It may be worth elaborating on @rici's comment with a solution to your problem:

parsimonious does not read E = E "+" E / id as a choice between E "+" E and id. As the error message shows, it accepts E "+" E as the whole rule and then cannot continue at / id.

The rule is also recursive on its left side. Substituting E into its own right-hand side gives:

E = (E "+" E) "+" E, etc.

This means that although the right operand of + is matched right away in your example expression (by the id production, here the terminal character 2), there would still be never-ending doubt about how to match the left side.

That is why the grammar you provided is wrong. Changing it to:

E = ( E "+" E ) / id

makes the intended grouping explicit and resolves the error above.
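
One caveat: even with the parentheses, the rule still starts with E itself (left recursion), which recursive-descent style PEG parsers generally cannot handle. A right-recursive form such as E = (id "+" E) / id, the same shape as the pyparsing rule in the other answer, avoids that. The following is only a minimal sketch of how such a rule is matched, written in plain Python rather than with parsimonious; parse_id, parse_E and matches are hypothetical names made up for this illustration.

```python
# Hand-written PEG-style recognizer for the right-recursive grammar
#     E  = (id "+" E) / id
#     id = a single digit
# parse_id, parse_E and matches are hypothetical helpers for this sketch only.

def parse_id(text, pos):
    """Return the position after one digit, or None if there is no digit at pos."""
    if pos < len(text) and text[pos] in "0123456789":
        return pos + 1
    return None


def parse_E(text, pos=0):
    """Return the end position of an E starting at pos, or None if none matches."""
    after_id = parse_id(text, pos)
    if after_id is None:
        return None
    # First alternative: id "+" E. The recursive call happens only after
    # a digit and a "+" have been consumed, so it always makes progress.
    if after_id < len(text) and text[after_id] == "+":
        after_rest = parse_E(text, after_id + 1)
        if after_rest is not None:
            return after_rest
    # Second alternative (ordered-choice fallback): a lone id.
    return after_id


def matches(text):
    """True if the whole text is an E."""
    return parse_E(text) == len(text)


print(matches("2+2"))    # True
print(matches("3+4+5"))  # True
print(matches("2+"))     # False
```

Because every recursive call to parse_E is preceded by consuming at least two characters, the recursion always terminates. A rule that begins with E has no such guarantee.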