I am trying to build a Boolean logic parser e.g. A == B AND C == D
to output something like And(Equals(A,B), Equals(C,D))
My parser has the following definitions:
def program: Parser[Operator] = {
phrase(operator)
}
def operator: PackratParser[Operator] = {
leaf | node
}
def node: PackratParser[Operator] = {
and | or
}
def leaf: PackratParser[Operator] = {
equal | greater | less
}
def and: PackratParser[Operator] = {
(operator ~ ANDT() ~ operator) ^^ {
case left ~ _ ~ right => And(left, right)}
}
I would expect the parser to map to program -> operator -> node -> and -> operator (left) -> leaf -> equal -> operator (right) -> leaf -> equal
. This doesn't work.
However if in the above code I do the changes
def operatorWithParens: PackratParser[Operator] = {
lparen ~> (operator | operatorWithParens) <~ rparen
}
and change and
to be
def and: PackratParser[Operator] = {
(operatorWithParens ~ ANDT() ~ operatorWithParens) ^^ {
case left ~ _ ~ right => And(left, right)}
}
Parsing (A == B) AND (C == D)
succeeds.
I can not wrap my head around why the former doesn't work while the later does.
How should I change my code to be able to parse A == B AND C == D
?
EDIT: Following @Andrey Tyukin advice I've modified the gramma to account for precedence
def program: Parser[Operator] = positioned {
phrase(expr)
}
def expr: PackratParser[Operator] = positioned {
(expr ~ ORT() ~ expr1) ^^ {
case left ~ _ ~ right => Or(left, right)} | expr1
}
def expr1: PackratParser[Operator] = positioned {
(expr1 ~ ANDT() ~ expr2) ^^ {
case left ~ _ ~ right => And(left, right)} | expr2
}
def expr2: PackratParser[Operator] = positioned {
(NOTT() ~ expr2) ^^ {case _ ~ opr => Not(opr)} | expr3
}
def expr3: PackratParser[Operator] = {
lparen ~> (expr) <~ rparen | leaf
}
And although PackratParser
supports left-recursive grammar, I run into an infinite loop that never leaves expr
It looks like there is a path from
operator
to a shorteroperator
:You seem to be assuming that the shorter
operator (left)
will somehow reduce toleaf
, whereas the outermostoperator
would skip theleaf
and pick thenode
, for whatever reason. What it does instead is just chocking on the firstleaf
it encounters.You could try to move the
node
before theleaf
, so that the wholeoperator
doesn't choke on the firstA
when seeing sth. likeA == B AND ...
.Otherwise, I'd suggest to refactor it into
where atomic formulas are either
Expect to use quite a few
repSep
s.