I'm working on a parser and I'm really frustrated. In the language, we can have an expression like:
new int[3][][]
or
new int[3]
Most of it parses correctly, except for the empty arrays at the end. In my parser I have:
Expression : int
| char
| null
| (...many others...)
| new NewExpression
and then a NewExpression is:
NewExpression : NonArrayType '[' Expression ']' EmptyArrays
| NonArrayType '[' Expression ']'
and then EmptyArrays is one or more empty bracket pairs (if I instead let EmptyArrays derive the empty string, it adds 20 shift/reduce conflicts):
EmptyArrays : EmptyArrays EmptyArray
| EmptyArray
EmptyArray : '[' ']'
However, when I look in the .info file for the parser, I get this:
State 214

    NewExpression -> NonArrayType lbrace Expression rbrace . EmptyArrays (rule 80)
    NewExpression -> NonArrayType lbrace Expression rbrace . (rule 81)

    dot reduce using rule 81
    ';' reduce using rule 81
    ',' reduce using rule 81
    '+' reduce using rule 81
    '-' reduce using rule 81
    '*' reduce using rule 81
    '/' reduce using rule 81
    '<' reduce using rule 81
    '>' reduce using rule 81
    '<=' reduce using rule 81
    '>=' reduce using rule 81
    '==' reduce using rule 81
    '!=' reduce using rule 81
    ')' reduce using rule 81
    '[' reduce using rule 81 --I expect this should shift
    ']' reduce using rule 81
    '?' reduce using rule 81
    ':' reduce using rule 81
    '&&' reduce using rule 81
    '||' reduce using rule 81
I expect though that if we're in state 214 and we see a left brace, we should shift it onto the stack and continue to parse EmptyArrays.
I'm not exactly sure what is going on, because when I strip out all of the excess baggage (e.g. by starting the parse with NewExpression), the additional brackets parse correctly. It's not possible for an Expression, a Statement, or any other non-terminal in the grammar to start with a left brace. It's especially confusing because I have a similar rule for if/else statements, which generates a shift/reduce conflict but chooses to shift if the next token is an else (this problem is well documented).
Can you help me figure out what is going wrong? I really appreciate your help; I feel like I'm tilting at windmills trying to figure out the problem.
You probably have a precedence set for '[' and/or ']' with something like
%left '['
which causes this behavior. Remove that precedence declaration, and this will reveal the shift/reduce conflict you have here.

As for why it's a shift/reduce conflict, you probably also have a rule for array access, where an Expression is followed by a bracketed index. Since a NewExpression is an Expression, it may be followed by an index like this. With '[' as the lookahead, the parser can't tell whether that's the beginning of an index expression or the beginning of an EmptyArray; deciding would require 2-token lookahead.

One thing you could try for this specific case would be to have your lexer do the extra lookahead needed here and recognize [] as a single token.
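To make the lexer suggestion concrete, here is a minimal sketch of a hand-written tokenizer that fuses an empty bracket pair into one token. Everything in it is an assumption for illustration: the Token type, its constructor names, and the tokenize function are hypothetical and will not match your real lexer, and for simplicity it treats int as a plain identifier rather than a keyword token.

```haskell
import Data.Char (isAlpha, isAlphaNum, isDigit, isSpace)

-- Hypothetical token type; the constructors in a real lexer will differ.
data Token
  = TNew
  | TIdent String
  | TIntLit Int
  | TLBracket        -- a '[' that opens an index or a sized dimension
  | TRBracket        -- ']'
  | TEmptyBrackets   -- '[' immediately followed by ']', fused into one token
  | TSym Char        -- any other single character
  deriving (Eq, Show)

tokenize :: String -> [Token]
tokenize [] = []
tokenize s@(c:cs)
  | isSpace c = tokenize cs
  | c == '[' =
      -- The extra lookahead happens here: skip whitespace after '[' and
      -- check whether the next character closes the pair.
      case dropWhile isSpace cs of
        (']':rest) -> TEmptyBrackets : tokenize rest
        _          -> TLBracket : tokenize cs
  | c == ']' = TRBracket : tokenize cs
  | isDigit c =
      let (ds, rest) = span isDigit s
      in TIntLit (read ds) : tokenize rest
  | isAlpha c =
      let (w, rest) = span isAlphaNum s
      in (if w == "new" then TNew else TIdent w) : tokenize rest
  | otherwise = TSym c : tokenize cs
```

With this sketch, tokenize "new int[3][][]" yields [TNew, TIdent "int", TLBracket, TIntLit 3, TRBracket, TEmptyBrackets, TEmptyBrackets]. On the grammar side, EmptyArray would then match the single fused token (declared in the %token section, for example as '[]' mapped to the hypothetical TEmptyBrackets) instead of '[' followed by ']'. In the state after the closing bracket the parser can then shift on '[]' and reduce on a plain '[', so one token of lookahead is enough. Note that the sketch only skips whitespace between the brackets, not comments, and any other rule that currently spells an empty pair as '[' ']' would need to use the fused token as well.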