Can LALR parser generators optionally consume unparsable input?

135 views Asked by cornuz At 02 November 2023 at 16:23

I'm converting an existing parser, from using a parser combinator to using a parser generator.

More specifically this is in a Haskell project, and the parser is moving from using megaparsec to using alex (flex-like) and happy (bison-like). However, I think the question is more generally about the two approaches.

Considering the following if-then-else statement:

if (condition) then_block else else_block

and with the important detail that conditions can be resolved at parsing time in this DSL (i.e. if-then-else statements will always resolve to either branch at parsing time and never end up as if-then-else in the AST), the top-down nature of parser combinators allows me to parse correctly a partially incorrect statement, like:

if (4 > 2) {
  <correct statements>
} else {
  <correct statements>
  <INCORRECT statement>  /* for example containing unresolved placeholders */
}

That's because once the condition is evaluated as true it is possible to consume the whole else_block without actually parsing it, with something like this:

ifelseStmt
  = do  reserved "if"
        e <- parens expression
        if evalBoolExpr e then do
          b <- block
          option [] (reserved "else" *> unparsedBlock $> [])
          return b
        else do unparsedBlock
                option [] (reserved "else" *> block)

unparsedBlock
  = do  hidden $ L.skipBlockCommentNested "{" "}"
        whiteSpace

Note: it may seem strange to want this risky behaviour at all, but in the particular setting this DSL is used it is always guaranteed that the condition is bound to selecting the branch with no incorrect statements.

The question:

With the bottom-up approach of flex/bison-type of tools, where lexing is basically independent of the parsing context, I don't see how to achieve the same result. Is it possible at all?

The current production rule in happy, which would fail to parse the else block in the example above:

ifelse_statement :: { [Statement] }
  : 'if' '(' expr ')' statement_block 'else' statement_block {
      if evalBoolExpr $3 then $5 else $7
    }

Original Q&A

TechQA.

Can LALR parser generators optionally consume unparsable input?

There are 0 answers

Related Questions in PARSING

Related Questions in HASKELL

Related Questions in HAPPY

Related Questions in ALEX

Popular Questions

Popular Tags

Trending Questions