How to resolve a shift/reduce conflict for nested lists using the same separator in JFLEX/CUP?

84 views Asked by At

I'm trying to implement a parser in JFLEX/CUP (JFlex 1.8.2, Cup 0.11b) for a language in which lists containing different kinds of elements can be nested (e.g. lists of statements and statements containing lists of expressions). I boiled the problem down to the following MWE:

Cup grammar:

terminal            A, B, C, COMMA;
 
non terminal Object list_a, list_b, item_a, item_b;

list_a             ::= list_a COMMA item_a | item_a;
item_a             ::= A | C list_b;
list_b             ::= list_b COMMA item_b | item_b;
item_b             ::= B;

The terminals are defined in the corresponding JFlex file as "a", "b", "c" and "," respectively; whitespace is skipped.

I get the following conflict when trying to compile:

Warning : *** Shift/Reduce conflict found in state #6
  between item_a ::= C list_b (*)
  and     list_b ::= list_b (*) COMMA item_b
  under symbol COMMA
  Resolved in favor of shifting.

Reading through the answers on

  1. Shift/Reduce conflict on C variation grammar
  2. How to avoid a shift reduce conflict in a LALR grammar for parsing nested lists?

was insightful, but did not solve my problem. None of the lists is empty, as in (1), and since the beginning of a list_b is always indicated by a preceding C a list of item_b productions could not possibly be parsed as a list of item_a productions, as in (2).

The problem seems to be related to the fact that list_b in the production item_a is not followed by any other terminal - I don't see a way to avoid this though. Making the following change

item_a             ::= A | C list_b COMMA item_a;

compiles and works for the cases where a list_b production is followed by a further item_a, but of course that does not cover all cases and is also semantically not what I want.

I also don't see a way to resolve the issue by defining the precedence respectively the associativity of COMMA. By doing so I can fix the Shift/Reduce conflict, but the grammar does not behave as desired. E.g. when defining precedence left COMMA;, I get for the input

a,a,a,a,a,c b,b,b,a,a

a syntax error on column 19 (i.e. the first "a" after the last "b").

Changing the grammar or using semantic actions are no options.

If anybody could give me some advice on how to solve this problem, it would be much appreciated.

1

There are 1 answers

0
user11718766 On BEST ANSWER

I think I have managed to solve the problem:

list_a             ::= item_a COMMA list_a | C list_b COMMA list_a | item_a | C list_b;
item_a             ::= A;
list_b             ::= list_b COMMA item_b | item_b;
item_b             ::= B;