I have an ANTLR 4 lexer grammar with a BEGIN lexer rule and an ID lexer rule:
lexer grammar Begin;
BEGIN : 'begin' ;
ID : [a-z]+ ;
WS : [ \t\r\n]+ -> skip ;
After generating and compiling the lexer, I ran the ANTLR TestRig tool with the input 'begin':
grun Begin tokens -tokens
begin
^Z
I got this output:
[@0,0:4='begin',<1>,1:0]
[@1,7:6='<EOF>',<-1>,2:0]
Notice the token type is 1 (as <1> indicates).
I ran it again, this time with the input 'beginning':
grun Begin tokens -tokens
beginning
^Z
I got this output:
[@0,0:8='beginning',<1>,1:0]
[@1,11:10='<EOF>',<-1>,2:0]
Why do I get the same token type? Does that mean the lexer is using the same lexer rule for both inputs?
How do I get TestRig to show me that the lexer uses the rule BEGIN : 'begin' ; to tokenize the input 'begin', and the rule ID : [a-z]+ ; to tokenize the input 'beginning'?
I tested the grammar with ANTLRWorks 2.1, and it works as expected with both 'begin' and 'beginning'.
[Screenshots: the test setup, the result for 'begin', and the result for 'beginning'.]
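To spell out what "as expected" means here: an ANTLR 4 lexer picks the rule that matches the longest stretch of input, and when two rules match the same length it picks the one defined first. Token types are numbered in definition order, so for this grammar BEGIN should be type 1 and ID type 2. The sketch below is not ANTLR code and does not use the generated lexer. It only imitates that selection logic with java.util.regex, and the names LexerRuleDemo and winningRule are made up for illustration.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class LexerRuleDemo {
    // Rules in the same order as in the grammar; the map's insertion
    // order stands in for ANTLR's "first rule wins on a tie" behavior.
    private static final Map<String, Pattern> RULES = new LinkedHashMap<>();
    static {
        RULES.put("BEGIN", Pattern.compile("begin"));
        RULES.put("ID", Pattern.compile("[a-z]+"));
        RULES.put("WS", Pattern.compile("[ \\t\\r\\n]+"));
    }

    /** Returns the name of the rule that wins at the start of the input, or null if none matches. */
    static String winningRule(String input) {
        String best = null;
        int bestLength = 0;
        for (Map.Entry<String, Pattern> rule : RULES.entrySet()) {
            Matcher m = rule.getValue().matcher(input);
            // lookingAt() anchors the match at the start of the input.
            // Only a strictly longer match replaces the current winner,
            // so on equal lengths the earlier rule is kept.
            if (m.lookingAt() && m.end() > bestLength) {
                best = rule.getKey();
                bestLength = m.end();
            }
        }
        return best;
    }

    public static void main(String[] args) {
        System.out.println("begin -> " + winningRule("begin"));
        System.out.println("beginning -> " + winningRule("beginning"));
    }
}
```

For 'begin', both BEGIN and ID match five characters, so the tie goes to BEGIN. For 'beginning', ID matches all nine characters and wins on length. By that logic the second TestRig run would be expected to report `<2>` rather than `<1>`.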