JavaCC syntax issue with understandability

Question

286 views Asked by MNM At 30 October 2019 at 01:51

I am starting to learn JavaCC and cannot tell whether I am approaching this problem correctly.

I am writing a parser for a custom language and using JavaCC to generate the Java parser source code.

I think the grammar is right, but I have a lot of doubt about whether it is actually correct.

Here is the .jj file I have so far.

options {
  JAVA_UNICODE_ESCAPE = true;
  STATIC = false;
}

PARSER_BEGIN(Custom_Lexer)
  public class Custom_Lexer {}
PARSER_END(Custom_Lexer)


void Custom_Lexer_Program() :
{}
{
  <BEGIN> <CLPL>
  ( Custom_Lexer_Statement() )*
  <END>
  <EOF>
}

void Custom_Lexer_Statement():
{}
{
    STATEMENT()
    <SEMICOLON>
}

void STATEMENT():
{}
{
    LOOKAHEAD(2) OUTPUT_STATEMENT()     |
    LOOKAHEAD(2) INPUT_STATEMENT()      |
    LOOKAHEAD(2) VARIABLE_DECLARATION() | 
    LOOKAHEAD(2) VARIABLE_ASSIGNMENT()  |
    LOOKAHEAD(2) IF_THEN_STATEMENT()
}

void OUTPUT_STATEMENT():
{}
{
    <OUTPUT> <EQUALS> EXPRESSION()
}

void INPUT_STATEMENT():
{}
{
    ( VARIABLE_DECLARATION() )*
}

void VARIABLE_DECLARATION():
{}
{
    <VARIABLE> (<EQUALS> <INT> | <BOOL> | <STRING>)?
}

void VARIABLE_ASSIGNMENT():
{}
{
    <VARIABLE> <EQUALS> EXPRESSION()
}

void IF_THEN_STATEMENT():
{}
{
    <IF> EXPRESSION() <THEN> VARIABLE_ASSIGNMENT() [<ELSE> VARIABLE_ASSIGNMENT()]
}
//Will define these later after the above issues are fixed
void EXPRESSION():
{}
{
    LOOKAHEAD(5) BINARY_EXPRESSION()        |
    LOOKAHEAD(5) IDENTIFIER_EXPRESSION()    |
    LOOKAHEAD(5) LITERAL_VALUE_EXPRESSION() |
    LOOKAHEAD(5) PARENTHESIZED_EXPRESSION()
}


//Reserved words
TOKEN: { <CLPL:   "CLPL"   > }
TOKEN: { <BEGIN:   "BEGIN"   > }
TOKEN: { <END:     "END"     > }
TOKEN: { <OUTPUT:  "OUTPUT"  > }
TOKEN: { <INPUT:   "INPUT"   > }
TOKEN: { <IF:      "IF"      > }
TOKEN: { <THEN:    "THEN"    > }


TOKEN: { <INT:    "int"      > }
TOKEN: { <BOOL:   "bool"     > }
TOKEN: { <STRING: "string"   > }


TOKEN: { <SEMICOLON:     ";" > }
TOKEN: { <LEFT_PAREN:    "(" > }
TOKEN: { <RIGHT_PAREN:   ")" > }
TOKEN: { <PLUS:          "+" > }
TOKEN: { <MINUS:         "-" > }
TOKEN: { <MULTIPLY:      "*" > }
TOKEN: { <DIVIDE:        "/" > }
TOKEN: { <EQUALITY:     "==" > }
TOKEN: { <EQUALS:        "=" > }
TOKEN: { <GT:            ">" > }
TOKEN: { <LT:            "<" > }


TOKEN: { <BOOLEAN_LITERAL: "true" | "false" > }


TOKEN: { <INTEGER_LITERAL: (["0"-"9"])+ > }


TOKEN: { <STRING_LITERAL: "\"" (~["\"","\\","\n","\r"] | "\\" (["n","t","b","r","f","\\","\'","\""] | ["0"-"7"] (["0"-"7"])? | ["0"-"3"] ["0"-"7"] ["0"-"7"]))* "\""> }


TOKEN: { <IDENTIFIER: (["a"-"z"]|["A"-"Z"]|"_")+((["a"-"z","A"-"Z","0"-"9","_"])*)? > }

There is 1 answer

**Theodore Norvell** · Accepted Answer · 2019-10-31T01:26:22+00:00

It's unfinished, but it looks like a reasonable start. I'd suggest that you avoid all LOOKAHEAD specifications until you understand better what you are doing. Try left factoring so that every choice can be made with the default lookahead of one token.
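
As a rough illustration of left factoring (a sketch, not the asker's final grammar): in the question, a variable declaration and a variable assignment both begin with a name followed by `=`, which is why the parser cannot choose between them on the first token. The standalone grammar below assumes that a declaration has the form `name = type`, that `<VARIABLE>` in the question corresponds to the `IDENTIFIER` token, and that an expression is just an integer literal or an identifier. The class name `FactoredSketch`, the productions `VariableStatement` and `Type`, and the `SKIP` rule are hypothetical additions made for this example.

```javacc
options {
  STATIC = false;
}

PARSER_BEGIN(FactoredSketch)
public class FactoredSketch {}
PARSER_END(FactoredSketch)

// Assumption: whitespace is insignificant (the question defines no SKIP rule).
SKIP : { " " | "\t" | "\r" | "\n" }

// Keywords come before IDENTIFIER so that "int" is matched as <INT>.
TOKEN : {
    <OUTPUT: "OUTPUT">
  | <INT: "int">
  | <BOOL: "bool">
  | <STRING: "string">
  | <EQUALS: "=">
  | <SEMICOLON: ";">
  | <INTEGER_LITERAL: (["0"-"9"])+>
  | <IDENTIFIER: ["a"-"z","A"-"Z","_"] (["a"-"z","A"-"Z","0"-"9","_"])*>
}

// Each alternative starts with a different token, so no LOOKAHEAD is needed.
void Statement() :
{}
{
    ( OutputStatement() | VariableStatement() ) <SEMICOLON>
}

void OutputStatement() :
{}
{
    <OUTPUT> <EQUALS> Expression()
}

// Left-factored: the shared prefix <IDENTIFIER> <EQUALS> is parsed once,
// and the decision is postponed until the token that follows it.
void VariableStatement() :
{}
{
    <IDENTIFIER> <EQUALS>
    (
        Type()          // declaration, e.g.  x = int
      | Expression()    // assignment,  e.g.  x = 42
    )
}

void Type() :
{}
{
    <INT> | <BOOL> | <STRING>
}

// Placeholder expression rule; the question leaves expressions undefined.
void Expression() :
{}
{
    <INTEGER_LITERAL> | <IDENTIFIER>
}
```

Once the shared prefix has been consumed, the parser only has to inspect the single token after `=`: a type keyword selects the declaration, and anything else is parsed as an expression. This only holds as long as no expression can begin with `int`, `bool`, or `string`.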

One problem I see is that the conflict between VARIABLE_DECLARATION and INPUT_STATEMENT can't be resolved since any VARIABLE_DECLARATION is also an INPUT_STATEMENT.
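
One possible way out, sketched here under stated assumptions: because the question's input statement is defined as zero or more variable declarations, no amount of lookahead can tell it apart from a single declaration. If the language is meant to introduce input with the `INPUT` keyword (the token is defined in the question but never used, so this is a guess about the intended syntax), requiring that keyword removes the overlap. The class name `InputSketch`, the `name = type` form of a declaration, and the `SKIP` rule are likewise assumptions made for this example.

```javacc
PARSER_BEGIN(InputSketch)
public class InputSketch {}
PARSER_END(InputSketch)

// Assumption: whitespace is insignificant.
SKIP : { " " | "\t" | "\r" | "\n" }

TOKEN : {
    <INPUT: "INPUT">
  | <INT: "int"> | <BOOL: "bool"> | <STRING: "string">
  | <EQUALS: "=">
  | <SEMICOLON: ";">
  | <IDENTIFIER: ["a"-"z","A"-"Z","_"] (["a"-"z","A"-"Z","0"-"9","_"])*>
}

// <INPUT> and <IDENTIFIER> are different tokens, so one token of
// lookahead is enough to pick the right alternative.
void Statement() :
{}
{
    ( InputStatement() | VariableDeclaration() ) <SEMICOLON>
}

// The leading keyword makes an input statement distinguishable, and
// using + instead of * means it can no longer match empty input.
void InputStatement() :
{}
{
    <INPUT> ( VariableDeclaration() )+
}

void VariableDeclaration() :
{}
{
    <IDENTIFIER> <EQUALS> ( <INT> | <BOOL> | <STRING> )
}
```

If input statements are not supposed to start with a keyword, some other distinguishing token would be needed; the point is only that the two productions must differ somewhere the parser can see.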
