How to enforce associativity rules in a GLR parser?

260 views Asked by At

I'm writing a GLR for fun (again, because I understood a few things since my last try). The parser is working now and I am implementing disambiguation rules. I am handling precedence in a way that seems to work. Now I'm a bit at loss regarding associativity.

Say I have a grammar like this :

E <- E '+' E (rule 1)
E <- E '-' E (rule 2)
E <- '0'     (rule 3)
E <- '1'     (rule 4)

Where rules 1) and 2) have the same precedence and left associativity.

Without associativity handling, the string '1-1+0' will generate two parse trees:

   1                2
  / \              / \
 /   \            /   \
2     3          4     1
|  \                   | \
4   4                  4  3

Where numbers indicate the rule used for reduction. The correct parse tree is the first one and thus I would like to keep only this one.

I'm wondering how to efficiently detect associativity infringements algorithmically.

One approach I tried was to see that in the first tree, at the top node, rule 2 is LEFT of rule 3 in the list of children of rule 1, whereas in the second tree rule 1 is RIGHT of rule 4 and thus since rules 2 and 1 are LEFT associative I keep only the first tree.

This, however, did not get me very far in more complicated examples. A limitation of this solution is that I can only discard trees based on a comparison with another tree.

Do you think I can find a solution using a refined version of this approach? What is the standard way of doing?

2

There are 2 answers

1
Gunther On

In my opinion this is best expressed by integrating into the grammar rules, completely resolving the ambiguity:

E <- F
E <- E '+' F
E <- E '-' F
F <- '0'
F <- '1'

As you are set for (G)LR, it should be possible to equally well express left- and right-associativity. The increase in depth of parse trees, due to unit derivations, can be addressed by postprocessing them appropriately.

This will completely avoid inventing a new mechanism, and exploit the expressiveness of the BNF that is used anyway. I think it requires strong arguments to instead favor an ambiguous notation, plus a separate specification of how to resolve.

The XQuery language specification, during its definition process, evolved from using ambiguous EBNF with extra disambiguation rules (see April 30, 2002 draft) to dropping the latter in favor of unambiguous rules incorporating precedence and associativity (see August 16, 2002 draft). As an implementer, I very much appreciated that - it made my life easier.

0
user764754 On

To do this algorithmically I would make two groups: SIMPLE which includes rule 3 and 4 and COMPLEX which includes rule 1 and 2. If the rightmost child of a (COMPLEX) (sub)root is COMPLEX then remove this tree because it is (partly) right-associative.