I'm trying to capture quoted strings without the quotes. I have this terminal
%token <string> STRING
and this production
constant:
| QUOTE STRING QUOTE { String($2) }
along with these lexer rules
| '\'' { QUOTE }
| [^ '\'']* { STRING (lexeme lexbuf) } //final regex before eof
It seems to be interpreting everything leading up to a QUOTE as a single lexeme, which doesn't parse. So maybe my problem is elsewhere in the grammar; I'm not sure. Am I going about this the right way? It was parsing fine before I tried to exclude quotes from strings.
Update
I think there may be some ambiguity with the following lexer rules
let name = alpha (alpha | digit | '_')*
let identifier = name ('.' name)*
The following rule is prior to STRING
| identifier { ID (lexeme lexbuf) }
Is there any way to disambiguate these without including quotes in the STRING regex?
It's pretty normal to do semantic analysis in the lexer for constants like strings and numeric literals, so you might consider a single lex rule that matches the whole quoted string constant, quotes included, and strips the quotes in its action.
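A minimal sketch of such a rule, assuming fslex syntax and reusing the `lexeme` helper and `STRING` token from the question. It is a fragment of the lexer spec rather than a complete file, and it handles neither escaped quotes nor unterminated strings:

```fsharp
// Sketch: replaces both the '\'' { QUOTE } rule and the [^ '\'']* rule.
// The regex consumes the opening quote, the body, and the closing quote;
// the action then drops the first and last character of the lexeme.
| '\'' [^ '\'']* '\''  { let s : string = lexeme lexbuf in STRING (s.Substring(1, s.Length - 2)) }
```

With this in place the QUOTE token is no longer needed, and the production would presumably shrink to `| STRING { String($1) }`. Every match of this rule starts with a quote, so it cannot overlap with `identifier`. The original `[^ '\'']*` rule, by contrast, matches any run of non-quote characters, and since the lexer prefers the longest match, it swallows identifiers, whitespace, and everything else up to the next quote.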