\[$end\] lookaheads in LALR

184 views Asked by At

I am trying to understand, how bison builds tables for this simple grammar:

input: rule ;
rule: rule '+' '1' 
    | '1' ;

I was able to calculate LR(1) transition table and item sets, but I don't understand how state 3 is built and works:

State 3

1 input: rule .  [$end]
2 rule: rule . '+' '1'

'+'  shift, and go to state 5

$default  reduce using rule 1 (input)

For default reduce rule I should put 'r1' into all cells of GOTO table for each symbol. But for shift rule I should put 's5' into column for '+' terminal (this cell already contains 'r1'). For me this looks like shift/reduce conflict. But not for bison. Please explain how that '[$end]' lookahead symbol appeared in this state, and how this state is processed in overall by LR state machine.

1

There are 1 answers

6
rici On BEST ANSWER
  1. default means "everything else", not "everything". In other words, you first fill in the specified actions, and then use the default action for any other lookahead symbol.

    If there is no default action, the action for any unspecified lookahead symbol is an error. Default reduce actions are often used to reduce table size where the algorithm would otherwise trigger an error. This optimization may have the result of delaying the reporting of an error, but the error will always be detected before another input is consumed, precisely because an error action is never replaced with a default shift. (Indeed, many parser generators never use default shift actions at all.)

  2. If you look at the grammar shown at the beginning of the .output file, you'll see that it has been augmented with the production:

    0 $accept: input $end
    

    Yacc/bison always adds a production like that to ensure that the complete input matches the start symbol. (The non-terminal input will, of course, be replaced by whatever start symbol has been declared with the %start directive, or with the first non-terminal in the grammar.)

    There is nothing special about this rule aside from the fact that reducing the $accept symbol causes the input to be accepted. (You can see that in state 4).

    $end is a special terminal symbol automatically generated when EOF is detected. (To be more precise, it is the terminal whose token value is 0, which the scanner returns to indicate end of file: (f)lex-generated scanners do this automatically.