Should lexer rules be unambiguous in Antlr4?
Suppose I would like to parse dates and have defined the following rules:
hour: DIGIT09 | (DIGIT1 DIGIT09) | (DIGIT2 DIGIT04);
month: DIGIT19 | (DIGIT1 DIGIT02);
DIGIT12: '1'..'2';
DIGIT1: '1';
DIGIT2: '2';
DIGIT19: '1'..'9';
DIGIT09: '0'..'9';
DIGIT04: '0'..'4';
DIGIT02: '0'..'2';
Here I defined the digit ranges in the lexer, but it looks like this doesn't work, since they are ambiguous.
Can I define the ranges in the parser instead of the lexer?
This type of validation is best performed in a listener or visitor which executes after a parse tree is created. Start with just a single NUMBER token, then define hour and month as parser rules based on it. After you have a parse tree, implement enterHour and enterMonth to validate that the NUMBER contained in each is valid.
This approach yields the best combination of error recovery and error reporting in the event the user enters incorrect input.
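The range checks themselves can be sketched in plain Java, independent of the ANTLR runtime. This is only an illustration under stated assumptions: the grammar in the leading comment is a guess at what the rules might look like, the class name DateFieldValidator and its method names are hypothetical, and the bounds 0..23 and 1..12 are assumed (the hour rule in the question also admits 24, so adjust as needed).

```java
import java.util.ArrayList;
import java.util.List;

// Assumed grammar (hypothetical sketch, not taken from the answer):
//   NUMBER : [0-9]+ ;
//   hour   : NUMBER ;
//   month  : NUMBER ;
//
// In a generated ANTLR4 listener, enterHour(ctx) would pass
// ctx.NUMBER().getText() to checkHour, and enterMonth(ctx) would
// do the same with checkMonth.
final class DateFieldValidator {
    // Messages are collected rather than thrown, so one pass over the
    // parse tree can report every invalid field.
    private final List<String> errors = new ArrayList<>();

    public void checkHour(String numberText) {
        checkRange("hour", numberText, 0, 23);   // assumed bounds
    }

    public void checkMonth(String numberText) {
        checkRange("month", numberText, 1, 12);  // assumed bounds
    }

    public List<String> errors() {
        return errors;
    }

    private void checkRange(String field, String text, int min, int max) {
        int value;
        try {
            value = Integer.parseInt(text);
        } catch (NumberFormatException e) {
            // Reached for empty text or digit strings too long for an int.
            errors.add(field + ": cannot be read as an integer: " + text);
            return;
        }
        if (value < min || value > max) {
            errors.add(field + " out of range " + min + ".." + max + ": " + text);
        }
    }
}
```

Because the lexer only ever produces NUMBER, input such as 25 for an hour still parses cleanly, and the listener can report it as an out-of-range value instead of the parser failing with a generic syntax error.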