I have the following context-free grammar for a simplified version of C++. When I run it with JFLEX and CUP I get a list of errors like that:
Warning : *** Reduce/Reduce conflict found in state #173
between especificador ::= (*)
and programa ::= (*)
under symbols: {VOID, CHAR, FLOAT, DOUBLE, SIGNED, UNSIGNED, INT, SHORT, LONG}
Resolved in favor of the second production.
I think the problem is with instrucoesIf but I can't figure it out
start with programa;
programa ::= especificador tipo ID programa2 | DEFINE ID num CRLF programa | ; verificar depois o ERRO
especificador ::= AUTO | STATIC | EXTERN | CONST | ;
tipo ::= VOID | CHAR | FLOAT | DOUBLE | SIGNED inteiro | UNSIGNED inteiro | inteiro;
inteiro ::= SHORT | INT | LONG;
programa2 ::= SEMICOLON programa | LBRACK num RBRACK SEMICOLON programa | LPAREN listaParametros RPAREN bloco programa | COMMA listaID programa;
listaID ::= ID declaracaoParam2 listaIDTail;
listaIDTail ::= SEMICOLON | COMMA listaID;
listaParametros ::= listaParamRestante | ;
listaParamRestante ::= declaracaoParam declParamRestante;
declaracaoParam ::= tipo ID declaracaoParam2;
declaracaoParam2 ::= LBRACK num RBRACK | ;
declParamRestante ::= COMMA listaParamRestante | ;
bloco ::= LBRACE conjuntoInst RBRACE | SEMICOLON conjuntoInst ;
conjuntoInst ::= programa conjuntoInst | instrucoes conjuntoInst | ;
instrucoes ::= ID expressao SEMICOLON | RETURN expr SEMICOLON | PRINTF LPAREN expr RPAREN SEMICOLON | SCANF LPAREN ID RPAREN SEMICOLON | BREAK SEMICOLON | IF LPAREN expr RPAREN instrucoes instrucoesIf;
instrucoesIf ::= ELSE instrucoes | ;
expressao ::= atribuicao | LBRACK expr RBRACK atribuicao | LPAREN exprList RPAREN | ;
atribuicao ::= operadorAtrib expr;
operadorAtrib ::= EQ | MULTEQ | DIVEQ | MODEQ | PLUSEQ | MINUSEQ;
expr ::= exprAnd exprOr;
exprList ::= expr exprListTail | ;
exprListTail ::= COMMA exprList | ;
exprOr ::= OR exprAnd exprOr | ;
exprAnd ::= exprEqual exprAnd2;
exprAnd2 ::= AND exprEqual exprAnd2 | ;
exprEqual ::= exprRelational exprEqual2;
exprEqual2 ::= EQEQ exprRelational exprEqual2 | NOTEQ exprRelational exprEqual2 | ;
exprRelational ::= exprPlus exprRelational2;
exprRelational2 ::= LT exprPlus exprRelational2 | LTEQ exprPlus exprRelational2 | GT exprPlus exprRelational2 | GTEQ exprPlus exprRelational2 | ;
exprPlus ::= exprMult exprPlus2;
exprPlus2 ::= PLUS exprMult exprPlus2 | MINUS exprMult exprPlus2 | ;
exprMult ::= exprUnary exprMult2;
exprMult2 ::= MULT exprUnary exprMult2 | DIV exprUnary exprMult2 | ;
exprUnary ::= PLUS exprParenthesis | MINUS exprParenthesis | exprParenthesis;
exprParenthesis ::= LPAREN expr RPAREN | primary;
primary ::= ID primaryID | num | literal;
primaryID ::= LBRACK primary RBRACK | LPAREN exprList RPAREN | ;
literal ::= STRING | CHAR;
num ::= NUM_INT | NUM_FLOAT;
CUP is a LALR parser generator, which means that there is no need to avoid left-recursion and so you can use a more natural grammatical style; there is no need to split lists into "start" and "continuation" productions, which are hard to read and error-prone, and furthermore often do not allow the direct construction of an accurate syntax tree.
There are lots of conflicts in your grammar, but this one is clearly ambiguous:
Note that
programaalso has an empty alternative. So there could be any number of emptyprogramas at the beginning (or, indeed, in the middle) of aconjuntoInst. You need to be very careful with non-terminals which can derive nothing; you must avoid ever using them in an undelimited list, because the parser cannot tell how many consecutive empty non-terminals are present in the source text.Note that completely empty statements are not allowed in C++, nor in C. A completely empty statement would produce exactly this kind of ambiguity. An empty statement must still be terminated with a
;. This makes it possible to have lists of statements.So the usual model for C statements (including blocks) (with lots of details left out) is: