PEST Parser does not recognise comments

I'm trying to write a parser with pest, the Rust parser generator, and I'm having trouble with a fairly simple grammar. file is the top-level rule in the grammar; it contains the SOI and EOI rules.

// example.pest

WHITESPACE = _{ "\n" | " " }
COMMENT = _{ "(*" ~ ANY* ~ "*)" }

KEYWORD = { ^"keyword" }

file = _{ SOI ~ KEYWORD ~ EOI }

Here is the contents of the file I'm trying to parse:

(*
*)
keyword

The generated parser cannot parse this file. The error looks like this:

1 | (*␊
  | ^---
  |
  = expected KEYWORD

The built-in COMMENT rule should handle this situation. Is whitespace handled differently inside comments?

How do I properly write a grammar with comments?

Answer by oorst:

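The grammar as given has a logic error. In the rule below, ANY* is greedy and never backtracks, so it consumes everything up to the end of the file, including the closing *). Nothing is left for the final "*)" to match, so COMMENT never succeeds and the parser reports that it expected KEYWORD at the start of the input.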

COMMENT = _{ "(*" ~ ANY* ~ "*)" }
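
To make the failure concrete, here is a minimal hand-written sketch in plain Rust that models the three steps of this rule under PEG semantics. The function name match_greedy_comment is hypothetical, and this is only an illustration of the matching order, not how pest is implemented.

```rust
/// Hand-written model of `"(*" ~ ANY* ~ "*)"` under PEG semantics.
/// Returns the end offset of the match, or None if the rule fails.
/// Illustrative only: this is not pest's own code.
fn match_greedy_comment(input: &str) -> Option<usize> {
    // "(*" : the opening delimiter must be present.
    let rest = input.strip_prefix("(*")?;
    // ANY* : the repetition is greedy and never gives characters back,
    // so it consumes everything up to the end of the input.
    let after_any = &rest[rest.len()..];
    // "*)" : nothing is left to match, so the whole rule fails here.
    after_any.strip_prefix("*)")?;
    Some(input.len())
}
```

Calling match_greedy_comment on the file from the question returns None, and so does every other input, because the last step always runs against an empty remainder.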

The rule should be

COMMENT = _{ "(*" ~ (!"*)" ~ ANY)* ~ "*)" }

The negative predicate !"*)" succeeds only when the next two characters are not *), and it consumes no input. The repetition therefore matches any number of characters but stops just before the first *). At that point the final "*)" in the sequence matches and the whole rule succeeds.
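
The same logic can be sketched by hand in plain Rust. This is only a model of what the corrected rule does, assuming byte offsets into a UTF-8 string; the function name match_comment is hypothetical and none of this is pest's own code.

```rust
/// Hand-written model of `"(*" ~ (!"*)" ~ ANY)* ~ "*)"`.
/// Returns the byte offset just past the comment, or None if the rule fails.
/// Illustrative only: this is not pest's own code.
fn match_comment(input: &str) -> Option<usize> {
    // "(*" : the opening delimiter must be present.
    let mut rest = input.strip_prefix("(*")?;
    // (!"*)" ~ ANY)* : look ahead before each character without consuming;
    // stop repeating as soon as the closing delimiter comes next.
    while !rest.starts_with("*)") {
        // ANY fails at end of input; the comment is unterminated, so the
        // closing "*)" could never match and the rule fails.
        let ch = rest.chars().next()?;
        rest = &rest[ch.len_utf8()..];
    }
    // "*)" : the delimiter is still unconsumed, so this step succeeds.
    let rest = rest.strip_prefix("*)")?;
    Some(input.len() - rest.len())
}
```

For the file from the question, match_comment("(*\n*)\nkeyword") returns Some(5), the offset just past the closing delimiter, which leaves the newline and keyword for the rest of the grammar. An unterminated comment such as "(* open" returns None.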