YACC: finding shift/reduce conflicts in a grammar

383 views Asked by At

I am reading the book theory of computation and there is a language PL in chapter 2 that is implemented in YACC. The program is very basic. There are grammar rules that are specified, and after running the program, it checks if a given file has the syntax of the specified grammar or not. All the rules are given in the book, and I wanted to implement it.

But when I implemented it, i get the shift/reduce conflict code. I searched the error on the web, and found out that the error refers to grammar ambiguity. I tried finding it but wasn't able to. in here there is a similar question, and a user have pointed out that its a warning and can be ignored since some languages are ambiguous.

Problem:

  • Can someone point out where the ambiguity might be?
  • When i try to run a code such as following, the program doesn't understand it. it gives syntax error. Although This should be accepted based on the grammar rules that I have applied. Am I passing a grammar with wrong syntax?

    while X = 10;
    X = Y + 10;
    end;
    

My code:

    %start program
    %%
    LETTER : 'A' | 'B' | 'C' | 'D' | 'E' | 'F' | 'G' | 'H' | 'I'
          | 'J' | 'K' | 'L' | 'M' | 'N' | 'O' | 'P' | 'Q' | 'R'
          | 'S' | 'T' | 'U' | 'V' | 'W' | 'X' | 'Y' | 'Z' 
          ;

    DIGIT : '0' | '1' | '2' | '3' | '4' | '5' | '6' | '7' | '8' | '9'
         ;

    name : LETTER
         | name DIGIT
         | name LETTER
         ;


    numeral :   DIGIT 
        |   numeral DIGIT
        ;


    operation   :   '+'
                |   '-'
                |   '*'
                |   '/'
                |   '='
                |   '<'
                |   '>'
                |   '>' '='
                |   '<' '='
                ;

    expression  :   name 
            |   numeral
            |   '(' '!' expression ')'
            |   '(' expression operation expression ')'
            ;

    assignment : name '<' '-' expression
               ;

    instruction : assignment
                | 'g' 'o' 't' 'o' name
                | 's' 't' 'o' 'p'
                ;


    labelinstr : name ':' instruction ';'
               | instruction ';'
               ;

    loop : 'l' 'o' 'o' 'p' expression ';'
         |  name ':' 'l' 'o' 'o' 'p' expression ';'
         ;

    ifthen  :   'i' 'f' expression 't' 'h' 'e' 'n' ';'
            |   name ':' 'i' 'f' expression 't' 'h' 'e' 'n' ';'
            ;
    while   :   'w' 'h' 'i' 'l' 'e' ';'
            |   name ':' 'w' 'h' 'i' 'l' 'e' expression ';'
            ;


    end : 'e' 'n' 'd' ';'
        | name ':' 'e' 'n' 'd' ';'
        ;

    program : labelinstr
            | loop program end
            | while program end
            | ifthen program end
            | ifthen program end 'e' 'l' 's' 'e' ';' program end
            | program program
            ;





    %%
    #include <stdio.h>

    yylex() {
      int c;
      while ( (c=getchar()) == ' ' || c == '\n' || c == '\t') {
         printf("%c",c);}
      printf("%c",c);
      return(c);
     }
2

There are 2 answers

0
n. m. could be an AI On BEST ANSWER

Ambiguity problems are not related to your syntax errors. Consider this:

 while   :   'w' 'h' 'i' 'l' 'e' ';'
         |   name ':' 'w' 'h' 'i' 'l' 'e' expression ';'
         ;

Something is missing in one of the alternatives. You want to loop while something when labeled, and while nothing when unlabeled?

 while   :   'w' 'h' 'i' 'l' 'e' expression ';'
         |   name ':' 'w' 'h' 'i' 'l' 'e' expression ';'
         ;

Aside: you should really factor out the labels. You already have a "labeled statement" production. Use it.

 while X = 10;

An expression is either a simple name or number, or parenthesized, so X = 10 is invalid on its own.

 while (X = 10) ;

This is not an assignment:

 X = Y + 10;

This is:

 X <- (Y + 10) ;

With these fixes there is no longer a syntax error (there are still conflicts but they are unrelated).

2
rici On

The first step in identifying shift/reduce conflicts is to use the -v flag to bison and examine the state machine in the resulting file which will have the suffice .output. That will tell you which states exhibit the error and which rules lead to that state. For example, with your program, we see two states with shift/reduce conflicts, state 65 and state 84.

State 84 is relatively simple:

State 84

   72 program: ifthen program end .
   73        | ifthen program end . 'e' 'l' 's' 'e' ';' program end

    'e'  shift, and go to state 101

    'e'       [reduce using rule 72 (program)]
    $default  reduce using rule 72 (program)

That is similar to the classic "dangling else" problem. Normally using a statement terminator like end; will solve this problem, but the grammar you present curiously insists on having an end; even in the case of an else clause. So

if (a > 3) then a <- 3; else a <- 2; end,

is not valid. Instead, the grammar insists on

if (a > 3) then a <- 3; end; else a <- 2; end;

That doesn't help solve the dangling else problem because the end doesn't distinguish between statements with and without else clauses, so the following is still ambiguous:

if (a > 3) then if (a < 7) then a <- 3; end; else a <- 7; end;

It seems to be unlikely that the grammar is correct. I suspect that the if productions should be:

        | ifthen program end
        | ifthen program 'e' 'l' 's' 'e' ';' program end

The other problem is in state 65: (here, I've omitted the transitions)

State 65

   74 program: program . program
   74        | program program .

That is clearly ambiguous. Suppose you had:

statement statement statement

This could be parsed as either binding left-to-right or right-to-left:

[program: [program: statement] [program: [program: statement] [program: statement]]] 
[program: [program: [program: statement] [program: statement]] [program: statement]] 

Roughly speaking the solution is usually something like:

statement: if_statement
         | loop_statement
         | ...

program: statement
       | program statement

Although personally, I'd probably factor out the labels.