antlr4 Similar token definition


I have a problem with token definitions.

Here is my grammar:

r: PROPNAME ':' PROPVALUE;
PROPNAME: [a-zA-Z]+;
PROPVALUE: [a-zA-Z0-9]+;

If I use

name:christof123

it matches.

If I use

name:christof

it doesn't match.

The parser complains that 'christof' is a PROPNAME token where a PROPVALUE was expected, since 'christof' matches both the PROPVALUE and PROPNAME expressions.

But I don't want a match on

name123:christof

Any ideas?


There is 1 answer

AudioBubble

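As you said, the lexer tokenizes christof as PROPNAME. ANTLR's lexer takes the longest possible match, and when several rules match the same length, the rule defined first in the grammar wins. christof matches PROPNAME and PROPVALUE with equal length, so PROPNAME wins; christof123 matches more characters as PROPVALUE, so it becomes PROPVALUE. You can check the tokens using grun.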

antlr4 MyGrammer.g4
javac -g *.java
grun MyGrammer r -tokens
# enter your input string and press Ctrl+D

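Your grammar produces the tokens below, which lead to the error on the last line. (The token recognition error at '\n' only means that no lexer rule matches the trailing newline; it is not the cause of the mismatch.)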

name:christof
line 1:13 token recognition error at: '\n'
[@0,0:3='name',<2>,1:0]
[@1,4:4=':',<1>,1:4]
[@2,5:12='christof',<2>,1:5]
[@3,14:13='<EOF>',<-1>,2:0]
line 1:5 mismatched input 'christof' expecting PROPVALUE
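
The tie-breaking behaviour behind this output can be reproduced outside ANTLR. The following is only a minimal Python sketch of "longest match, earliest rule wins on a tie", not ANTLR's actual implementation; the `tokenize` function and the `COLON` name given to the implicit ':' token are assumptions made for illustration.

```python
import re

# Lexer rules in definition order, mirroring the question's grammar.
# COLON is a hypothetical name for the implicit ':' literal token.
RULES = [
    ("COLON", re.compile(r":")),
    ("PROPNAME", re.compile(r"[a-zA-Z]+")),
    ("PROPVALUE", re.compile(r"[a-zA-Z0-9]+")),
]


def tokenize(text, rules=RULES):
    """Return (rule_name, lexeme) pairs: longest match, earliest rule on ties."""
    tokens, pos = [], 0
    while pos < len(text):
        best_name, best_len = None, 0
        for name, pattern in rules:
            m = pattern.match(text, pos)
            # Strict '>' keeps the earlier rule when two matches are equally long.
            if m and m.end() - pos > best_len:
                best_name, best_len = name, m.end() - pos
        if best_name is None:
            raise ValueError(f"token recognition error at: {text[pos]!r}")
        tokens.append((best_name, text[pos:pos + best_len]))
        pos += best_len
    return tokens


print(tokenize("name:christof"))     # last token is PROPNAME
print(tokenize("name:christof123"))  # last token is PROPVALUE
```

In this sketch, christof ties between the two rules and goes to PROPNAME, while christof123 is longer under PROPVALUE, which matches what grun reports above.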

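Modifying your grammar as follows solves this: the lexer only classifies each token as ALPHA or ALPHANUM, and the parser rules decide where each kind is allowed.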

r: name ':' value;

name: ALPHA;
value: ALPHA | ALPHANUM;

ALPHA: [a-zA-Z]+;
ALPHANUM: [a-zA-Z0-9]+;

This produces the following tokens with grun, without the mismatched input error.

name:christof
line 1:13 token recognition error at: '\n'
[@0,0:3='name',<2>,1:0]
[@1,4:4=':',<1>,1:4]
[@2,5:12='christof',<2>,1:5]
[@3,14:13='<EOF>',<-1>,2:0]
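
To see that the revised grammar also rejects name123:christof, here is a small Python sketch that mimics it. It is an approximation under stated assumptions, not ANTLR output: `token_types` and `matches_r` are hypothetical helpers, `COLON` is a made-up name for the implicit ':' token, and input is given without a trailing newline.

```python
import re

# Lexer rules in definition order, mirroring the revised grammar.
RULES = [
    ("COLON", re.compile(r":")),
    ("ALPHA", re.compile(r"[a-zA-Z]+")),
    ("ALPHANUM", re.compile(r"[a-zA-Z0-9]+")),
]


def token_types(text):
    """Return token type names: longest match, earliest rule wins on ties."""
    types, pos = [], 0
    while pos < len(text):
        best_name, best_len = None, 0
        for name, pattern in RULES:
            m = pattern.match(text, pos)
            if m and m.end() - pos > best_len:
                best_name, best_len = name, m.end() - pos
        if best_name is None:
            raise ValueError(f"token recognition error at: {text[pos]!r}")
        types.append(best_name)
        pos += best_len
    return types


def matches_r(text):
    """Check r: name ':' value;  name: ALPHA;  value: ALPHA | ALPHANUM;"""
    try:
        types = token_types(text)
    except ValueError:
        return False
    return (
        len(types) == 3
        and types[0] == "ALPHA"
        and types[1] == "COLON"
        and types[2] in ("ALPHA", "ALPHANUM")
    )


for s in ("name:christof", "name:christof123", "name123:christof"):
    print(s, matches_r(s))
```

The first two inputs are accepted and the third is rejected, because name123 is tokenized as ALPHANUM and the name rule only allows ALPHA.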