Java cup: Shift/Reduce conflict

486 views Asked by At

I'm trying to write a parser using JFlex and Cup, but I have some issues dealing with recursive pointed notation like function call with recursive properties access like :

var x = obj.property1.functionCall(p1, p2).property2;

Here is the related parser definition:

unary_exp ::= postfix_exp
              | ADD2 unary_exp
              | SUB2 unary_exp
              | unary_operator unary_exp;

unary_operator ::=  ADD
                    | SUB
                    | BIT_NOT
                    | NOT;

postfix_exp ::= primary_exp
                | postfix_exp LPAR_SQ right_value_exp RPAR_SQ
                | postfix_exp LPAR optional_postfix_exp_list RPAR
                | postfix_exp DOT postfix_exp
                | postfix_exp SUB2
                | postfix_exp ADD2;

primary_exp ::= BOOL
                | STRING
                | INTEGER
                | NUMBER
                | NULL;

I got following shift/reduce conflicts :

Warning : *** Shift/Reduce conflict found in state #190
  between postfix_exp ::= postfix_exp DOT postfix_exp (*) 
  and     postfix_exp ::= postfix_exp (*) LPAR optional_postfix_exp_list RPAR 
  under symbol LPAR
  Resolved in favor of shifting.

Warning : *** Shift/Reduce conflict found in state #190
  between postfix_exp ::= postfix_exp DOT postfix_exp (*) 
  and     postfix_exp ::= postfix_exp (*) LPAR_SQ right_value_exp RPAR_SQ 
  under symbol LPAR_SQ
  Resolved in favor of shifting.

Warning : *** Shift/Reduce conflict found in state #190
  between postfix_exp ::= postfix_exp DOT postfix_exp (*) 
  and     postfix_exp ::= postfix_exp (*) DOT postfix_exp 
  under symbol DOT
  Resolved in favor of shifting.

Warning : *** Shift/Reduce conflict found in state #190
  between postfix_exp ::= postfix_exp DOT postfix_exp (*) 
  and     postfix_exp ::= postfix_exp (*) ADD2 
  under symbol ADD2
  Resolved in favor of shifting.

Warning : *** Shift/Reduce conflict found in state #190
  between postfix_exp ::= postfix_exp DOT postfix_exp (*) 
  and     postfix_exp ::= postfix_exp (*) SUB2 
  under symbol SUB2
  Resolved in favor of shifting.

Error : *** More conflicts encountered than expected -- parser generation aborted

May someone explain to me how to deal with these conflicts or where to find a working example using java cup, yacc or bison?

1

There are 1 answers

1
rici On BEST ANSWER

Your grammar includes this production:

postfix_exp ::= postfix_exp DOT postfix_exp

Do you think that's accurate? Can the right operand of . be an arbitrary postfix_exp? If so, what does a.3 mean? How about x.false? (A primary_exp is a postfix_exp, right? That's what the grammar says.)

Closer to home, what is the meaning of a.b++? Is there any sense in which b++ is an operand of .? Surely b is the name of a member of a, not an independent variable.

Grammars convey meaning even though they don't implement semantics. They specify the way different parts of an expression relate to each other. ("Parse" and "parts" sound similar because the verb "parse" comes from the noun "part", as in "to split into parts".) So you should strive to make your grammar correspond to the structure of the language.

Now, any time you see a production like:

foo ::= foo OPERATOR foo;

you know that the grammar is ambiguous. It's ambiguous because it generates both left- and right-associative parses for OPERATOR. IF OPERATOR were -, for example, it would be conflating (1 - 1) - 1 (assuming that foo could produce 1) with 1 - (1 - 1). And grammatical ambiguity always manifests as a parse table conflict.

Resolving the ambiguity is important because those two expressions have different values. Which is why we either have to use a hierarchy of expression types, or operator precedence declarations. (So we know how to solve the associativity problem.)

But in the case of the . operator, we really don't have a binary operator of unspecified associativity. We don't really have a binary operator at all, which is why the production here is called postfix_exp. The operator is a field selector, which we might think of as a collection of different postfix operators, one for each member name of the appropriate composite type.

So that's the simple answer here. As soon as we say that the field selector has the form DOT IDENTIFIER, the ambiguity vanishes.