Pegjs: Don't allow reserved keywords as a variable name


I am writing my own language in Pegjs and, as usual, it has some keywords, such as true, false, if, else and today. Now I want to declare a variable, and naturally the variable name must not be one of the reserved keywords. A name can be any alphabetic character followed by alphanumeric characters, with the exception of the language keywords.

I did the following (testable in Pegjs Online):

variable = c:(alpha alphanum*)
{
  var keywords = ["true", "false", "if", "else", "today"];

  var res = c[0]
  for (var i = 0; i<c[1].length; i++) {
    res=res+c[1][i]
  }

  if(keywords.indexOf(res)>=0) {
    return error('\'' + res + '\'' + ' is a keyword and cannot be used as a variable name.');
  }

  return { 'dataType' : 'variable', 'dataValue' : res };
}

alpha = [a-zA-Z]
alphanum = [a-zA-Z0-9_]

boolean = v: ("true" / "false")
{
  return { 'dataType' : 'boolean', 'dataValue': v};
}

Now true is illegal, but true1 is not. This is fine. However, since I have defined the boolean structure somewhere else in my language, is it not possible to re-use that definition instead of manually re-defining the non-allowed keywords inside my variable definition?

You can imagine why my solution is error-prone. I tried several things but they did not work.

Thanks for your help!

Answered by bekroogle

Simple Answer:

(See this code in action at http://peg.arcanis.fr/2VbQ5G/)

    variable = ! keyword (alpha alphanum*)
    {
      return { 'dataType' : 'variable', 'dataValue': text()};
    }

    keyword = "true" / "false" / "if" / "else" / "today"

    alpha = [a-zA-Z]
    alphanum = [a-zA-Z0-9_]

    boolean = ("true" / "false")
    {
      return { 'dataType' : 'boolean', 'dataValue': text()};
    }

Note: This loses your helpful error reporting. If I get a chance, I'll try to put up an answer that retains it.
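One way to keep the message is to check the matched name against a single keyword list rather than relying on the predicate alone. The sketch below shows that check in plain JavaScript, outside of any grammar. `KEYWORDS` and `parseVariable` are made-up names for illustration; inside a grammar, the same test would presumably sit in the rule's action and call `error(...)` as the question's code does.

```javascript
// Hypothetical helper mirroring what a `variable` rule action could do.
// KEYWORDS and parseVariable are illustrative names, not part of PEG.js.
const KEYWORDS = ["true", "false", "if", "else", "today"];

function parseVariable(name) {
  // Same shape as `alpha alphanum*` in the grammar.
  if (!/^[a-zA-Z][a-zA-Z0-9_]*$/.test(name)) {
    return null; // not an identifier at all
  }
  // Whole-name comparison: "true" is rejected, "true1" is not.
  if (KEYWORDS.includes(name)) {
    throw new Error(
      "'" + name + "' is a keyword and cannot be used as a variable name."
    );
  }
  return { dataType: "variable", dataValue: name };
}
```

Because the comparison is against the whole name, a keyword is reported with the original message while a name such as true1 is still accepted.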

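The important bit of the grammar above is at the beginning of the variable rule: ! keyword. The easiest way to grok this is that the parser looks ahead at the upcoming input without consuming it. If what it finds is not a keyword, then it allows the rest of the rule to try to match. If, on the other hand, it is a keyword, then the ! keyword expression (and by extension, the whole variable rule) fails. Note that ! keyword also fails when the input merely starts with a keyword, so true1 is rejected too; writing the predicate as !(keyword !alphanum) rejects only whole keywords.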

To quote David Majda's documentation:

! expression

Try to match the expression. If the match does not succeed, just return undefined and do not advance the parser position, otherwise consider the match failed.
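
To make the quoted semantics concrete, here is a small hand-rolled sketch of a negative predicate in plain JavaScript. It is not how PEG.js is implemented; `literal`, `choice`, `sequence`, `pattern`, `not` and `matches` are illustrative names. It compares a rule shaped like `!keyword (alpha alphanum*)` with one shaped like `!(keyword !alphanum) alpha alphanum*`.

```javascript
// A parser is a function (input, pos) -> { pos, value } on success, null on failure.
const literal = (s) => (input, pos) =>
  input.startsWith(s, pos) ? { pos: pos + s.length, value: s } : null;

// Ordered choice: the first alternative that succeeds wins.
const choice = (...parsers) => (input, pos) => {
  for (const p of parsers) {
    const r = p(input, pos);
    if (r) return r;
  }
  return null;
};

// Sequence: every parser must succeed, each starting where the previous one ended.
const sequence = (...parsers) => (input, pos) => {
  const values = [];
  for (const p of parsers) {
    const r = p(input, pos);
    if (!r) return null;
    values.push(r.value);
    pos = r.pos;
  }
  return { pos, value: values };
};

// Match a regular expression exactly at `pos` using the sticky flag.
const pattern = (re) => (input, pos) => {
  const sticky = new RegExp(re.source, "y");
  sticky.lastIndex = pos;
  const m = sticky.exec(input);
  return m ? { pos: pos + m[0].length, value: m[0] } : null;
};

// Negative predicate: succeeds with `undefined` and does not advance
// when `p` fails at this position; fails when `p` succeeds.
const not = (p) => (input, pos) =>
  p(input, pos) ? null : { pos, value: undefined };

const keyword = choice(
  literal("true"), literal("false"), literal("if"),
  literal("else"), literal("today")
);
const alphanum = pattern(/[a-zA-Z0-9_]/);
const identifier = pattern(/[a-zA-Z][a-zA-Z0-9_]*/);

// Shaped like: variable = !keyword (alpha alphanum*)
const variablePrefixCheck = sequence(not(keyword), identifier);

// Shaped like: variable = !(keyword !alphanum) alpha alphanum*
const variableWholeWord = sequence(
  not(sequence(keyword, not(alphanum))),
  identifier
);

// True when the parser consumes the entire input.
const matches = (p, s) => {
  const r = p(s, 0);
  return r !== null && r.pos === s.length;
};
```

With these definitions, `matches(variablePrefixCheck, "true1")` is false because the lookahead sees the keyword prefix, whereas `matches(variableWholeWord, "true1")` is true and `matches(variableWholeWord, "true")` is false.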