Bison, C++ GLR parsing: how to force shift\reduce conflict?

1.4k views Asked by At

How can I force the shift\reduce conflict to be resolved by the GLR method?
Suppose I want the parser to resolve the conflict between the right shift operator and two closing angle brackets of template arguments for itself. I make the lexer pass the 2 consecutive ">" symbols, as separate tokens, without merging them into one single ">>" token. Then i put these rules to the grammar:

operator_name:  
     "operator" ">"  
   | "operator" ">" ">"  
;  

I want this to be a shift\reduce conflict. If I have the token declaration for ">" with left associativity, this will not be a conflict. So I have to remove the token precedence\associativity declaration, but this results in many other conflicts that I don't want to solve manually by specifying the contextual precedence for each conflicting rule. So, is there a way to force the shift\reduce conflict while having the token declared?

2

There are 2 answers

1
Tavian Barnes On

I believe that using context-dependent precedence on the rules for operator_name will work.

The C++ grammar as specified by the updated standard actually modifies the grammar to accept the >> token as closing two open template declarations. I'd recommend following it to get standard behaviour. For example, you must be careful that "x > > y" is not parsed as "x >> y", and you must also ensure that "foo<bar<2 >> 1>>" is invalid, while "foo<bar<(2 >> 1)>>" is valid.

1
umlcat On

I worked in Yacc (similar to Bison), with a similar scenario.

Standard grammars are, sometimes, called "parsing directed by syntax".

This case is, sometimes, called something like "parsing directed by semantics".

Example:

...
// shift operator example
if ((x >> 2) == 0)
...
// consecutive template closing tag example
List<String, List<String>> MyList =
...

Lets remember, our mind works like a compiler. The human mind can compile this, but the previous grammars, can't. Mhhh. Lets see how a human mind, would compile this code.

As you already know, the "x" before the consecutive ">" and ">" tokens indicates an expression or lvalue. The mind thinks "two consecutive greater-than symbols, after an expresion, should become a single shift operator token".

And for the "string" token: "two consecutive greater-than symbols, after a type identifier, should become two consecutive template closing tag tokens".

I think this case cannot be handled by the usual operator precedence, shift or reduce, or just grammars, but using ( "hacking" ) some functions provided by the parser itself.

I don't see an error in your example grammar rule. The "operator" symbol avoids confusing the two cases you mention. The parts that should be concern its the grammars where the shift operator its used, and the consecutive template closing tags are used.

operator_expr_example:  
  lvalue "<<"  lvalue |  
  lvalue ">>"  lvalue |
  lvalue "&&"  lvalue |
;  

template_params:  
  identifier |
  template_declaration_example |
  array_declaration |
  other_type_declaration 
;  

template_declaration_example:  
  identifier "<"  template_params ">"
;  

Cheers.