Parse lambda calculus style function applications with LL1 parser

323 views Asked by At

I'm using TinyPG, which is an LL1 parser generator, to parse lambda calculus. I'm trying to write a rule that will parse function application like (a b) or (a b c) and so on.

So far I have this rule (a bit simplified):

APPLICATION -> LPARENTHESES VARIABLE (SPACE+ VARIABLE)+ RPARENTHESES;

But that'd fail to parse a term that has spaces after the left and before the right brackets: ( a b ). I can allow spaces after the opening bracket like this:

APPLICATION -> LPARENTHESES SPACE* VARIABLE (SPACE+ VARIABLE)+ RPARENTHESES;

But I'm having trouble setting it to allow spaces before the closing bracket. I came up with this, which seems to work:

ARG_LIST        -> (RPARENTHESES | (SPACE+ (RPARENTHESES | (VARIABLE ARG_LIST))));
APPLICATION     -> LPARENTHESES SPACE* VARIABLE ARG_LIST;

But is messy and recursive, which will then make it hard to read and compile the nodes. Is there any non recursive or at least simpler way to parse this?

1

There are 1 answers

0
rici On BEST ANSWER

There is no reason to confuse the parser with whitespace. It is sufficient for it to be ignored in the scanner using the [Skip] attribute as shown in the tutorial:

[Skip] WHITESPACE -> @"\s+";

"Skip" does not mean "delete". It means that the scanner should recognize the token and then ignore it. If you skip whitespace, whitespace will still separate alphanumeric tokens just fine. You just don't need to include the space token in your grammar, leaving you with:

APPLICATION -> LPARENTHESES VARIABLE VARIABLE+ RPARENTHESES;

(Actually, empty application lists are usually allowed, so I'd write that with a * instead of a +. But it's your language.)