Simple ambiguous grammar with reduce-reduce conflict

415 views Asked by At

The following simple grammar to parse a logical expression results in a reduce/reduce conflict:

%token AND OR
%token NUMBER VARIABLE
%%
logical_expr
    : logical_expr AND logical_term
    | logical_expr OR logical_term
    | logical_term
    ;
logical_term
    : VARIABLE
    | comparison
    | '(' logical_expr ')'
    ;
comparison
    : expr '<' expr
    | expr '>' expr
    ;
expr
    : expr '+' term
    | expr '-' term
    | term
    ;
term
    : NUMBER
    | VARIABLE
    | '(' expr ')'
    ;
%%

The status report from bison has:

state 2

    4 logical_term: VARIABLE .
   13 term: VARIABLE .

    ')'       reduce using rule 4 (logical_term)
    ')'       [reduce using rule 13 (term)]
    '<'       reduce using rule 13 (term)
    '>'       reduce using rule 13 (term)
    '+'       reduce using rule 13 (term)
    '-'       reduce using rule 13 (term)
    $default  reduce using rule 4 (logical_term)

I'm guessing the problem is that it can't figure out how to parse "(a) + 1 < 2". How does one disambiguate this grammar? Is it possible?

1

There are 1 answers

2
Chris Dodd On BEST ANSWER

The basic problem with your grammar is that when you seen ( VARIABLE and the next token is ), the parser can't tell whether this should be a parenthesized expr or logical_expr -- it depends on the next token AFTER the ). If that next token is +. -, < or > then its an expr, while if it's AND or OR (or EOF), then its a logical_expr.

The usual solution to this is to NOT try to do type-checking in the grammar. While its possible, it requires extra lookahead and may require multilevel grammars or such complexity.

In your case, if you change the logical_term rule to

logical_term
    : comparison
    | expr
    ;

the conflict goes away, but your parser will accept things that are not type correct, such as a > 3 AND 2 or 2 + 2 OR 7. You'll need to do type-checking of your resulting parse tree (or whatever data structures you are creating) for correctness, though you would likely need that anyways (at the very least you already needed to typecheck VARIABLE to ensure that the variable was numeric or boolean, depending on context.)