I'm converting an existing parser, from using a parser combinator to using a parser generator.
More specifically this is in a Haskell project, and the parser is moving from using megaparsec
to using alex
(flex
-like) and happy
(bison
-like). However, I think the question is more generally about the two approaches.
Considering the following if-then-else
statement:
if (condition) then_block else else_block
and with the important detail that conditions can be resolved at parsing time in this DSL (i.e. if-then-else
statements will always resolve to either branch at parsing time and never end up as if-then-else
in the AST),
the top-down nature of parser combinators allows me to parse correctly a partially incorrect statement, like:
if (4 > 2) {
<correct statements>
} else {
<correct statements>
<INCORRECT statement> /* for example containing unresolved placeholders */
}
That's because once the condition is evaluated as true
it is possible to consume the whole else_block
without actually parsing it, with something like this:
ifelseStmt
= do reserved "if"
e <- parens expression
if evalBoolExpr e then do
b <- block
option [] (reserved "else" *> unparsedBlock $> [])
return b
else do unparsedBlock
option [] (reserved "else" *> block)
unparsedBlock
= do hidden $ L.skipBlockCommentNested "{" "}"
whiteSpace
Note: it may seem strange to want this risky behaviour at all, but in the particular setting this DSL is used it is always guaranteed that the condition is bound to selecting the branch with no incorrect statements.
The question:
With the bottom-up approach of flex/bison
-type of tools, where lexing is basically independent of the parsing context, I don't see how to achieve the same result.
Is it possible at all?
The current production rule in happy
, which would fail to parse the else block in the example above:
ifelse_statement :: { [Statement] }
: 'if' '(' expr ')' statement_block 'else' statement_block {
if evalBoolExpr $3 then $5 else $7
}