How to match keywords with pest?


I'm trying to parse a line like this:

MyTupleComponent str, str

with this grammar:

cname = _{ (ASCII_ALPHANUMERIC | "_")+ }

ints = { "i8" | "i16" | "i32" | "i64" | "i128" | "isize" }
uints = { "u8" | "u16" | "u32" | "u64" | "u128" | "usize" }
strings = { "str" | "String" }
types = { strings | ints | uints }

tuple_component = { cname ~ (types ~ ("," ~ types)?)+ }

But I end up with this error:

Err(Error { variant: ParsingError { positives: [types], negatives: [] }, location: Pos(20), line_col: Pos((1, 21)), path: None, line: "MyTupleComponent str, str", continued_line: None })

Does anyone know why the rule doesn't match correctly?

1 Answer

Answered by Victor Sergienko

You can take two roads:

As @ZachThompson pointed out, define WHITESPACE. If you do, also make cname atomic (@): inside a non-atomic rule, pest allows implicit whitespace between the repetitions of +, so cname would capture alphanumerics AND spaces.

This test grammar seems fine:

WHITESPACE = _{ " " }
cname = @{ (ASCII_ALPHANUMERIC | "_")+ }

ints = { "i8" | "i16" | "i32" | "i64" | "i128" | "isize" }
uints = { "u8" | "u16" | "u32" | "u64" | "u128" | "usize" }
strings = { "str" | "String" }
types = { strings | ints | uints }

tuple_component = { cname ~ (types ~ ("," ~ types)?)+ }

file = { tuple_component ~ EOI }

Otherwise, you can leave WHITESPACE undefined and account for spaces manually in each rule. This approach works too, but it doesn't scale well as the grammar grows.
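
A minimal sketch of the manual approach, assuming WHITESPACE is not defined anywhere in the grammar, so ~ inserts nothing implicitly. Where the spaces go is an illustrative choice rather than something taken from the question: at least one space after the name, and optional spaces around each comma. Here tuple_component is written as one type followed by any number of comma-separated types:

cname = _{ (ASCII_ALPHANUMERIC | "_")+ }

ints = { "i8" | "i16" | "i32" | "i64" | "i128" | "isize" }
uints = { "u8" | "u16" | "u32" | "u64" | "u128" | "usize" }
strings = { "str" | "String" }
types = { strings | ints | uints }

tuple_component = { cname ~ " "+ ~ types ~ (" "* ~ "," ~ " "* ~ types)* }

file = { tuple_component ~ EOI }

Every new rule has to spell out its own spacing like this, which is where the extra effort comes from as the grammar grows.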

P.S. Do you intend to accept input like MyTupleComponent str, str str, str, with no comma between the second and third types? The current rule parses it without error. If that is not intended, you may want to simplify the rule to

tuple_component = { cname ~ types ~ ("," ~ types)* }