I'm porting a grammar from Scala parser combinators to ANTLR4. The original grammar uses the 'not(p: Parser)' parser combinator, which succeeds when the enclosed parser fails.
In the parser I am porting, I used the 'not' combinator to tell special comments, which start with '/*!', apart from standard comments, which start with '/*', while allowing standard comments (either multiline or end-of-line) within special comments, and also allowing comments nested in comments.
Below is the original Scala code:
/* Annotation blocks with user defined contents. */
lazy val specialComment: PackratParser[Any] = specialCommentBegin ~> rep( not( multilineCommentEnd ) ~ ( comment | specialCommentContents ) ) ~ multilineCommentEnd
/* The whitespace parser, swallows both true whitespace and non-special comments. */
lazy val whitespaceParser: PackratParser[Any] = rep( whiteSpace | comment )
/* Multiline comment start delimiter. */
lazy val multilineCommentStart: PackratParser[Any] = not( specialCommentBegin ) ~ multilineCommentBegin
/* Nested multiline comments. */
lazy val multilineComment: PackratParser[Any] = multilineCommentStart ~ rep( not( multilineCommentEnd ) ~ ( comment | any ) ) ~ multilineCommentEnd
/* End of line comments. */
lazy val endOfLineComment: PackratParser[Any] = endOfLineCommentBegin ~ rep ( anyButEOL ) ~ "\n"
/* Matches everything except end of line. */
lazy val anyButEOL: PackratParser[Any] = not ( "\n" ) ~ any
/* Any comment. */
lazy val comment = multilineComment | endOfLineComment
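To make the role of 'not' concrete, here is a minimal, dependency-free sketch of the lookahead used in multilineCommentStart. The names P, lit and seq are hypothetical helpers written only for illustration, not the scala-parser-combinators API; the only thing meant to mirror the real combinator is that 'not' succeeds, without consuming input, exactly when the enclosed parser fails.

```scala
// Hypothetical mini-combinators for illustration; not the scala-parser-combinators API.
object NotSketch {
  // A parser takes the input and an offset and returns the new offset on success.
  type P = (String, Int) => Option[Int]

  // Matches the literal s at the current offset.
  def lit(s: String): P =
    (in, i) => if (in.startsWith(s, i)) Some(i + s.length) else None

  // Negative lookahead: succeeds without consuming input iff p fails here.
  def not(p: P): P =
    (in, i) => if (p(in, i).isEmpty) Some(i) else None

  // Sequencing: run a, then run b from where a stopped.
  def seq(a: P, b: P): P =
    (in, i) => a(in, i).flatMap(j => b(in, j))

  val specialCommentBegin: P = lit("/*!")
  // Same shape as the original: not( specialCommentBegin ) ~ multilineCommentBegin
  val multilineCommentStart: P = seq(not(specialCommentBegin), lit("/*"))

  def main(args: Array[String]): Unit = {
    println(multilineCommentStart("/* plain */", 0))    // Some(2)
    println(multilineCommentStart("/*! special */", 0)) // None
  }
}
```

This is the behaviour I need to reproduce in ANTLR4: '/*' must open a standard comment only when it is not immediately followed by '!'.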
Is there an equivalent of 'not' in ANTLR4 (either a built-in construct or a design pattern) that would let me parse input such as:
/* /*! this is an interpreted special comment */ that gets discarded because commented out */
or
/*! this is an interpreted special comment /* containing a comment */ */
or
/*! a special comment // with end-of-line comments
* which spans several lines // and again
* /* and again
over several lines
*/
*/
Thanks for your help!
ANTLR's lexer rules can call themselves recursively, so you could make one big token out of each special comment like this: