In pest.rs with the pest-ast crate, how do I derive an enum whose variants have fields?

I have an example Pest grammar:

WHITESPACE = _{ " " }
identifier = @{ ASCII_ALPHA ~ (ASCII_ALPHANUMERIC | "_")* }
int_literal = { DECIMAL_NUMBER+ }

assignment_op = { ":=" }
formula = { (identifier ~ assignment_op ~ int_literal) | int_literal }

file = { formula ~ EOI }

and these pest-ast derives:

// #[macro_use] brings the Parser and FromPest derive macros into scope.
#[macro_use]
extern crate pest_derive;
extern crate from_pest;
#[macro_use]
extern crate pest_ast;
extern crate pest;


mod parser {
    #[derive(Parser)]
    #[grammar = "talk/formula.pest"]
    pub struct Parser;
}


mod ast {
    use super::parser::Rule;
    use pest::Span;

    fn span_into_str(span: Span) -> &str {
        span.as_str()
    }

    #[derive(Debug, FromPest)]
    #[pest_ast(rule(Rule::int_literal))]
    pub struct IntLiteral {
        #[pest_ast(outer(with(span_into_str), with(str::parse::<i64>), with(Result::unwrap)))]
        pub value: i64
    }

    #[derive(Debug, FromPest)]
    #[pest_ast(rule(Rule::identifier))]
    pub struct Identifier {
        #[pest_ast(inner(with(span_into_str), with(String::from)))]
        pub value: String
    }

    #[derive(Debug, FromPest)]
    #[pest_ast(rule(Rule::assignment_op))]
    pub struct AssignmentOp {
    }

    #[derive(Debug, FromPest)]
    #[pest_ast(rule(Rule::formula))]
    pub enum Formula {
        Assignment {
            lvalue: Identifier,
            a: AssignmentOp, // can I skip this?
            rvalue: IntLiteral,
        },
        IntLiteral {
            rvalue: IntLiteral,
        }
    }
}

#[cfg(test)]
mod tests {
    use super::*;
    use super::ast::*;
    use pest::Parser;
    use from_pest::FromPest;

    #[test]
    fn test_formula0() {
        let source = "a := 12";
        let mut parse_tree = parser::Parser::parse(parser::Rule::formula, source).unwrap();
        println!("parse tree = {:#?}", parse_tree);
        let syntax_tree: Formula = Formula::from_pest(&mut parse_tree).expect("infallible");
        println!("syntax tree = {:#?}", syntax_tree);
    }
}

Running the test panics with infallible: NoMatch.

  • Does pest-ast even support deriving enum variants with fields?
  • Can I match an enum variant to a parenthesized () group of terminals?
  • Can I skip some terminals? I don't exactly need to know := was used if I get an AssignmentExpression { lvalue, rvalue } in the end.

There is 1 answer

Victor Sergienko

I found an example in pest-ast issue #8. Grammar rules:

seq = { a ~ b ~ c }
choice = { a | b | c }
compound_seq = { a ~ (b | c) }
compound_choice = { (a ~ b) | (b ~ c) }
assign = { (a|b|c) ~ "=" ~ number }
assigns = { (assign ~ ",")* ~ assign ~ ","? }

correspond to this code (pseudo-Rust that shows the shape of the types; it does not compile as written):

enum choice<'pest>{
  struct _1(a<'pest>),
  struct _2(b<'pest>),
  struct _3(c<'pest>),
}
struct compound_seq<'pest>(
  #[pest_ast(outer)] Span<'pest>,
  a<'pest>,
  enum _2 {
    struct _1(b<'pest>),
    struct _2(c<'pest>),
  },
);
enum compound_choice<'pest>{
  struct _1(
    #[pest_ast(outer)] Span<'pest>,
    a<'pest>,
    b<'pest>,
  ),
  struct _2(
    #[pest_ast(outer)] Span<'pest>,
    b<'pest>,
    c<'pest>,
  ),
}
struct assign<'pest>(
  #[pest_ast(outer)] Span<'pest>,
  enum _1 {
    struct _1(a<'pest>),
    struct _2(b<'pest>),
    struct _3(c<'pest>),
  },
  number<'pest>,
);
struct assigns<'pest>(
  #[pest_ast(outer)] Span<'pest>,
  Vec<struct _1(assign<'pest>)>,
  assign<'pest>,
);

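Once I knew I was on the right track, I found the error in my code, and it was unrelated to the question asked: the Identifier struct should take its value from the outer span instead of an inner one. The identifier rule is atomic (@), so its pair contains no inner pairs, which is presumably why the inner conversion ended in NoMatch.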

#[derive(Debug, FromPest)]
#[pest_ast(rule(Rule::identifier))]
pub struct Identifier {
    #[pest_ast(outer(with(span_into_str), with(String::from)))]
    pub value: String
}
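
To illustrate why the choice between inner and outer matters here, the following is a toy model in plain Rust. It does not use pest at all: ToyPair, outer_text, first_inner_text, and sample_formula are hypothetical names made up for this sketch. It assumes that an outer field reads the text of the pair itself, while an inner field reads from the first child pair and fails when there is none.

```rust
// Toy model of a parse-tree node. NOT the pest API: `ToyPair`,
// `outer_text`, `first_inner_text`, and `sample_formula` are hypothetical.
#[derive(Debug)]
struct ToyPair<'a> {
    rule: &'static str,
    text: &'a str,
    inner: Vec<ToyPair<'a>>,
}

// Models an `outer(...)` field: use the text matched by the pair itself.
fn outer_text<'a>(pair: &ToyPair<'a>) -> &'a str {
    pair.text
}

// Models an `inner(...)` field: use the text of the first child pair.
// Returns None when there is no child, standing in for a failed match.
fn first_inner_text<'a>(pair: &ToyPair<'a>) -> Option<&'a str> {
    pair.inner.first().map(|child| child.text)
}

// Builds the tree for "foobar := 12" in the shape the question's grammar implies.
fn sample_formula() -> ToyPair<'static> {
    ToyPair {
        rule: "formula",
        text: "foobar := 12",
        inner: vec![
            // `identifier` is atomic, so it has no child pairs.
            ToyPair { rule: "identifier", text: "foobar", inner: vec![] },
            ToyPair { rule: "assignment_op", text: ":=", inner: vec![] },
            ToyPair { rule: "int_literal", text: "12", inner: vec![] },
        ],
    }
}

fn main() {
    let formula = sample_formula();
    let identifier = &formula.inner[0];
    println!("{}: outer = {:?}", identifier.rule, outer_text(identifier));
    println!("{}: inner = {:?}", identifier.rule, first_inner_text(identifier));
    println!("{}: inner = {:?}", formula.rule, first_inner_text(&formula));
}
```

In this model the identifier node only yields its text through the outer lookup, which matches the fix of switching Identifier from inner to outer.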

The most useful debugging tool was to print the raw parse tree that the identifier rule produced, next to the derived syntax tree:

#[test]
fn test_identifier() {
    let source = "foobar";
    let mut parse_tree = parser::Parser::parse(parser::Rule::identifier, source).unwrap();
    println!("parse tree = {:#?}", parse_tree);
    let syntax_tree: Identifier = Identifier::from_pest(&mut parse_tree).expect("infallible");
    println!("syntax tree = {:#?}", syntax_tree);
    assert_eq!(syntax_tree.value, "foobar".to_string());
}

I also had to remove the struct inside the enum to get Formula to compile:

#[derive(Debug, FromPest)]
#[pest_ast(rule(Rule::formula))]
pub enum Formula {
    Assignment {
        lvalue: Identifier,
        // a: AssignmentOp,
        rvalue: IntLiteral,
    },
    OrTest {
        or_test: IntLiteral,
    }
}

The answers to the questions:

Does pest-ast even support deriving enum variants with fields?

Yes, see the Formula example above.

Can I match an enum variant to a parenthesized () group of terminals?

No answer yet. This hasn't worked for me.

Can I skip some terminals? I don't exactly need to know := was used if I get an AssignmentExpression { lvalue, rvalue } in the end.

pest-ast works with the tree produced by pest. In order to skip something, make it a silent rule in the source grammar.
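
As a minimal sketch of that suggestion, assuming the rest of the question's grammar stays unchanged, the operator can be marked silent with a leading underscore. pest emits no pair for a silent rule, so the formula pair should then contain only the identifier and int_literal pairs, and the Assignment variant needs no AssignmentOp field.

```pest
// Silent rule: still matched, but it produces no pair in the parse tree.
assignment_op = _{ ":=" }
formula = { (identifier ~ assignment_op ~ int_literal) | int_literal }
```

Printing the parse tree for "a := 12", as in the tests above, is a quick way to confirm that the operator no longer appears.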