I am writing a grammar to handle scalar and vector expressions. The grammar below is simplified to show the problem I have where a scalar expression can be derived from a vector and a vector can be derived from a scalar. For example, a vector could be a literal [1, 2, 3]
or the product of a scalar and a vector 2 * [1, 2, 3]
(equivalent to [2, 4, 6]
). A scalar could be a literal 2
or an index into a vector [1, 2, 3][1]
(equivalent to 2
).
grammar LeftRecursion;
Integer
: [0-9]+
;
WhiteSpace
: [ \t\r\n]+ -> skip
;
input
: expression EOF;
expression
: scalar
| vector
;
scalar
: Integer
| vector '[' Integer ']'
;
vector
: '[' Integer ',' Integer ',' Integer ']'
| scalar '*' vector
;
ANTLR4 gives me the error: The following sets of rules are mutually left-recursive [scalar, vector]
. This makes sense because scalar
references vector
and vice-versa, but at the same time it should be deterministic.
How would I refactor this grammar to avoid the mutual (indirect) left-recursion? I could expand one of the terms inplace, but that would introduce a lot of duplication in the full grammar where there are more alternatives for vector and scalar. I could also refactor the grammar to have a primary expression, but I don't want to allow scalar '*' scalar
as a valid vector
alternative. Are there other options?
gives that you could write an expressions
which would render a stack overflow of the parser for java and other languages with limited stack-depth.
I think you should create a different grammatical rule for vector-lookups, it's not a scalar, it just results in a scalar, but this should be handled in the parser tree handling, not in ANTLR.