I ran into this problem while writing grammar rules for CREATE TABLE. My grammar fails when a column name is the same as a token already defined in the grammar (for example, a column named 'create' gets matched as the 'create' keyword).
Simple use case:
grammar hello;
start :
'hello' 'world' ID
;
ID : 'a'..'z'+ ;
WS : (' '|'\n'|'\r')+ {$channel=HIDDEN;} ;
For this grammar, how do I make "hello world hello" a valid input? Currently it fails with a MissingTokenException.
AST
               root
                 |
               start
     ____________|_____________
     |           |            |
   hello       World   MissingTokenException
Thanks in advance.
EDIT:
I have found this approach, which defines inline parser rules for "hello" and "world" using semantic predicates. I have yet to work out how it works.
grammar hello;
stat: keyHELLO keyWORLD expr
;
expr: ID
;
/** An ID whose text is "hello" */
keyHELLO : {input.LT(1).getText().equals("hello")}? ID ;
/** An ID whose text is "world" */
keyWORLD : {input.LT(1).getText().equals("world")}? ID ;
// END:rules
ID : 'a'..'z'+ ;
WS : (' '|'\n'|'\r')+ {$channel=HIDDEN;} ;
AST
               root
                 |
               start
     ____________|_____________
     |           |            |
  keyHello    keyWorld       expr
     |           |            |
   hello       world        world
Hope it might help.
When you parse a language, it always has "reserved words". It's almost impossible to work without them. In your case, you have two options:
Define a group of reserved words and make your ID an extension of it. I don't recommend this option, because it becomes a terrible mess when you work with an entire grammar, or when you use your lexer, parser, and tree and want to treat some tokens or reserved words differently (maybe skip them).
In this case you can be sure when you are parsing a reserved word (RW), but not when you are parsing an ID, because IDs and RWs go through the same rule.
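One possible reading of this first option, sketched as an ANTLR 3 grammar: the reserved words get their own lexer tokens, and a parser rule accepts either a plain ID or any reserved word wherever an identifier is allowed. The rule name `id` and the token names `HELLO` and `WORLD` are assumptions made for illustration; they do not come from the original grammar.

```antlr
grammar hello;

start : HELLO WORLD id ;

// Hypothetical rule: an identifier is a plain ID or any reserved word.
// Every new keyword must also be added here, which is why this
// approach gets hard to maintain in a large grammar.
id : ID | HELLO | WORLD ;

// Keyword tokens are listed before ID so the lexer prefers them
// when the input text is exactly 'hello' or 'world'.
HELLO : 'hello' ;
WORLD : 'world' ;
ID : 'a'..'z'+ ;
WS : (' '|'\n'|'\r')+ {$channel=HIDDEN;} ;
```

With this sketch, the input "hello world hello" should be accepted: the third word is lexed as HELLO, and the `id` rule lets it through. The trade-off is the one described above: once a token has matched `id`, you can no longer tell from the rule alone whether it was a reserved word or an ordinary ID.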
With the second option, you can define a subgroup separate from the new IDs I described. You can also distinguish a selected group of words from your vocabulary ('world', 'hello', etc.) and handle them differently in your tree.
I hope this helps!