I have the following context-free grammar for a simplified version of C++. When I run it with JFLEX and CUP I get a list of errors like that:
Warning : *** Reduce/Reduce conflict found in state #173
between especificador ::= (*)
and programa ::= (*)
under symbols: {VOID, CHAR, FLOAT, DOUBLE, SIGNED, UNSIGNED, INT, SHORT, LONG}
Resolved in favor of the second production.
I think the problem is with instrucoesIf but I can't figure it out
start with programa;
programa ::= especificador tipo ID programa2 | DEFINE ID num CRLF programa | ; verificar depois o ERRO
especificador ::= AUTO | STATIC | EXTERN | CONST | ;
tipo ::= VOID | CHAR | FLOAT | DOUBLE | SIGNED inteiro | UNSIGNED inteiro | inteiro;
inteiro ::= SHORT | INT | LONG;
programa2 ::= SEMICOLON programa | LBRACK num RBRACK SEMICOLON programa | LPAREN listaParametros RPAREN bloco programa | COMMA listaID programa;
listaID ::= ID declaracaoParam2 listaIDTail;
listaIDTail ::= SEMICOLON | COMMA listaID;
listaParametros ::= listaParamRestante | ;
listaParamRestante ::= declaracaoParam declParamRestante;
declaracaoParam ::= tipo ID declaracaoParam2;
declaracaoParam2 ::= LBRACK num RBRACK | ;
declParamRestante ::= COMMA listaParamRestante | ;
bloco ::= LBRACE conjuntoInst RBRACE | SEMICOLON conjuntoInst ;
conjuntoInst ::= programa conjuntoInst | instrucoes conjuntoInst | ;
instrucoes ::= ID expressao SEMICOLON | RETURN expr SEMICOLON | PRINTF LPAREN expr RPAREN SEMICOLON | SCANF LPAREN ID RPAREN SEMICOLON | BREAK SEMICOLON | IF LPAREN expr RPAREN instrucoes instrucoesIf;
instrucoesIf ::= ELSE instrucoes | ;
expressao ::= atribuicao | LBRACK expr RBRACK atribuicao | LPAREN exprList RPAREN | ;
atribuicao ::= operadorAtrib expr;
operadorAtrib ::= EQ | MULTEQ | DIVEQ | MODEQ | PLUSEQ | MINUSEQ;
expr ::= exprAnd exprOr;
exprList ::= expr exprListTail | ;
exprListTail ::= COMMA exprList | ;
exprOr ::= OR exprAnd exprOr | ;
exprAnd ::= exprEqual exprAnd2;
exprAnd2 ::= AND exprEqual exprAnd2 | ;
exprEqual ::= exprRelational exprEqual2;
exprEqual2 ::= EQEQ exprRelational exprEqual2 | NOTEQ exprRelational exprEqual2 | ;
exprRelational ::= exprPlus exprRelational2;
exprRelational2 ::= LT exprPlus exprRelational2 | LTEQ exprPlus exprRelational2 | GT exprPlus exprRelational2 | GTEQ exprPlus exprRelational2 | ;
exprPlus ::= exprMult exprPlus2;
exprPlus2 ::= PLUS exprMult exprPlus2 | MINUS exprMult exprPlus2 | ;
exprMult ::= exprUnary exprMult2;
exprMult2 ::= MULT exprUnary exprMult2 | DIV exprUnary exprMult2 | ;
exprUnary ::= PLUS exprParenthesis | MINUS exprParenthesis | exprParenthesis;
exprParenthesis ::= LPAREN expr RPAREN | primary;
primary ::= ID primaryID | num | literal;
primaryID ::= LBRACK primary RBRACK | LPAREN exprList RPAREN | ;
literal ::= STRING | CHAR;
num ::= NUM_INT | NUM_FLOAT;
CUP is a LALR parser generator, which means that there is no need to avoid left-recursion and so you can use a more natural grammatical style; there is no need to split lists into "start" and "continuation" productions, which are hard to read and error-prone, and furthermore often do not allow the direct construction of an accurate syntax tree.
There are lots of conflicts in your grammar, but this one is clearly ambiguous:
Note that
programa
also has an empty alternative. So there could be any number of emptyprograma
s at the beginning (or, indeed, in the middle) of aconjuntoInst
. You need to be very careful with non-terminals which can derive nothing; you must avoid ever using them in an undelimited list, because the parser cannot tell how many consecutive empty non-terminals are present in the source text.Note that completely empty statements are not allowed in C++, nor in C. A completely empty statement would produce exactly this kind of ambiguity. An empty statement must still be terminated with a
;
. This makes it possible to have lists of statements.So the usual model for C statements (including blocks) (with lots of details left out) is: