I'm writting a grammar for arithmetic expression in prefix notation. However I have an issue when parsing negative numbers or substraction. Grammar example is this:
precedence right +, -;
precedence right *, /;
precedence right uminus;
E ::= + E E
| - E E
| * E E
| / E E
| ( E )
| - E %prec uminus
| id
| digit
;
But if my input is - 5 4
, it reduces 5
as E
, next it reduces - E
(negative) and then parser gives me a syntax error at 4
. The correct one should be 5
as E
, next 4
as E
and then - E E
as E
. How can I solve this problem using associativity? or do I need to rewrite my grammar?
(Promoted from comment)
Your grammar really is ambiguous, and precedence declarations won't help you a bit.
Consider the input the input consisting of
N
-
tokens, followed byM
1
tokens.In order for this to be an expression,
M-1
of the-
tokens must be binary, and the remainingN-(M-1)
unary, but there is no way to tell which is which (unless they are all binary).Even if you arbitrarily say that the first
N-(M-1)
-
s are unary, you can't tell what the value ofN-(M-1)
is until you read the entire input, which means you can't parse with a finite lookahead.But the whole point of prefix notation is to avoid the need for parentheses. Arbitrary declarations like the above make it impossible to represent alternative interpretations, so that some expressions would be impossible to represent in prefix notation. That's just plain wrong.
Here's a simple case:
is either
In prefix notation, you need to declare the "arity" of every operator, either implicitly (every operator has a known number of arguments), or explicitly using a notation like this, borrowed from Prolog:
Alternatively, you can delimit the arguments with mandatory parentheses, as with Lisp/Scheme "s-exprs":