Grammar: Precedence of grammar alternatives

99 views Asked by At

This is a very basic question about grammar alternatives. If you have the following alternative:

Myalternative: 'a' | .;
Myalternative2: 'a' | 'b';

Would 'a' have higher priority over the '.' and over 'b'?

I understand that this may also depend on the behaviour of the parser generated by this syntax but in pure theoretical grammar terms could you imagine these rules being matched in parallel i.e. test against 'a' and '.' at the same time and select the one with highest priority? Or is the 'a' and . ambiguous due to the lack of precedence in grammars?

1

There are 1 answers

0
rici On BEST ANSWER

The answer depends primarily on the tool you are using, and what the semantics of that tool is. As written, this is not a context-free grammar in canonical form, and you'd need to produce that to get a theoretical answer, because only in that way can you clarify the intended semantics.

Since the question is tagged , I'm going to guess that this is part of an Antlr lexical definition in which . is a wildcard character. In that case, 'a' | . means exactly the same thing as ..

Since MyAlternative matches everything that MyAlternative2 matches, and since MyAlternative comes first in the Antlr lexical definition, MyAlternative2 can never match anything. Any single character will be matched by MyAlternative (unless there is some other lexical rule which matches a longer sequence of input characters).

If you put the definition of MyAlternative2 first in the grammar file, then a or b would be matched as MyAlternative2, while any other character would be matched as MyAlternative.

The question of precedence within alternatives is meaningless. It doesn't matter whether MyAlternative considers the match of an a to be a match of a or a match of .. It is, in the end, a match of MyAlternative, and that symbol can only have one associated action.

Between lexical rules, there is a precedence relationship: The first one wins. (More accurately, as far as I know, Antlr obeys the usual convention that the longest match wins; between two rules which both match the same longest sequence, the first one in the grammar file wins.) That is not in any way influenced by alternative bars in the rules themselves.