I am trying to understand ANTLR predicates. To that end, I have a simple lexer and parser, shown below.
What I would like to do is use a predicate to insert the word "fubar" whenever the input contains "foo" followed by some whitespace and then "bar", while keeping the same basic structure. Bonus points for doing it in the lexer, and further bonus points for doing it without referring to the target language at all. If a target language is necessary, it is C#.
For example, if the input string is:
programmers use the words foo bar and bar foo class
the output would be:
programmers use the words foo fubar bar and bar foo class
Lexer:
lexer grammar TextLexer;
@members
{
    protected const int EOF = Eof;
    protected const int HIDDEN = Hidden;
}
FOO: 'foo';
BAR: 'bar';
TEXT: [a-z]+ ;
WS
    : ' ' -> channel(HIDDEN)
    ;
Parser:
parser grammar TextParser;
options { tokenVocab=TextLexer; }
@members
{
    protected const int EOF = Eof;
}
file: words EOF;
word
    : FOO
    | BAR
    | TEXT
    ;
words
    : word
    | word words
    ;
compileUnit
    : EOF
    ;
ANTLR3's lexer might have needed a predicate in this case, but ANTLR4's lexer is much "smarter". You can match "foo bar" in a single lexer rule and change its inner text with setText(...):
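A minimal sketch of what such a rule could look like, assuming the C# target from the question. There the token text is assigned through the lexer's Text property, where the Java target would call setText(...). The rule name FOO_BAR is hypothetical, and the parser's word rule has to accept the new token:

```antlr
// TextLexer.g4 (sketch): FOO_BAR is a hypothetical rule name.
// ANTLR4 prefers the longest match, so when "foo" is followed by
// spaces and "bar", this rule wins over FOO and TEXT without a predicate.
FOO_BAR
    : 'foo' ' '+ 'bar' { Text = "foo fubar bar"; }  // C# target; Java: setText("foo fubar bar");
    ;

// TextParser.g4 (sketch): allow the combined token wherever a word may appear.
word
    : FOO_BAR
    | FOO
    | BAR
    | TEXT
    ;
```

Two caveats with this sketch. The replacement text uses single spaces, so a run of several spaces between "foo" and "bar" is collapsed. The rule also has no notion of word boundaries, so an input such as "foo barn" would be tokenized as FOO_BAR followed by a TEXT token for "n".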