I would like to use a different lexer for tatsu, yet use tatsu's parser. Is this possible? For example, in the grammar:
expr = NUM | ID | (expr '+' expr) ;
is it possible to use an alternative lexer to provide NUM and ID?
Recent versions of TatSu allow the use of a different lexer (called Tokenizer in TatSu).
The parser will probably have to rely on semantic actions to verify the grammar rules that correspond to tokens.
There are some unfinished experiments from my work helping with the Python PEG parser at https://github.com/neogeny/pygl.
In general, PEG parsers don't use a separate lexer because they don't need one. Lexical elements can be specified using the same grammar language.
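To illustrate what "no separate lexer" means in practice, here is a minimal hand-written sketch in plain Python. It is not TatSu's implementation or API. The names ScannerlessParser, atom, and parse are hypothetical, and the question's left-recursive rule is rewritten as a loop (expr = atom {'+' atom}) to keep the sketch short. The point is only that NUM and ID are regular expressions matched by the parser itself at the current position, with no token stream in between.

```python
import re

# Lexical elements are ordinary regular expressions used directly by the
# parser; there is no separate tokenizing pass.
NUM = re.compile(r"\d+")
ID = re.compile(r"[A-Za-z_]\w*")
PLUS = re.compile(r"\+")
WS = re.compile(r"\s*")
END = re.compile(r"\Z")


class ScannerlessParser:
    """Hypothetical illustration only; not how TatSu is implemented."""

    def __init__(self, text):
        self.text = text
        self.pos = 0

    def _match(self, pattern):
        # Skip whitespace, then try the pattern at the current position.
        self.pos = WS.match(self.text, self.pos).end()
        m = pattern.match(self.text, self.pos)
        if m is None:
            return None
        self.pos = m.end()
        return m.group()

    def atom(self):
        # atom = NUM | ID  (ordered choice, as in a PEG)
        num = self._match(NUM)
        if num is not None:
            return ("NUM", num)
        ident = self._match(ID)
        if ident is not None:
            return ("ID", ident)
        raise SyntaxError(f"expected NUM or ID at position {self.pos}")

    def expr(self):
        # expr = atom {'+' atom}  (the question's rule without left recursion)
        node = self.atom()
        while self._match(PLUS) is not None:
            node = ("+", node, self.atom())
        return node


def parse(text):
    parser = ScannerlessParser(text)
    tree = parser.expr()
    # Require that the whole input was consumed.
    if parser._match(END) is None:
        raise SyntaxError(f"unexpected input at position {parser.pos}")
    return tree
```

Calling parse("x + 42") builds a small tuple tree whose leaves are tagged NUM or ID, so the grammar rules and the "token" definitions live in the same place.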
TatSu, a PEG parser generator, doesn't support separate lexers either, yet the Buffer class provides facilities for avoiding partial matches of literal tokens and for specifying lexical elements using regular expressions: