I am currently learning about lexing and parsing with the F# toolset (FsLex/FsYacc) by parsing a simple calculation, and I am stuck: my lexer does not advance to consume the whole input string:
let lexeme = LexBuffer<_>.LexemeString
// ...
rule test = parse
| digit+ { Console.WriteLine("1_" + (lexeme lexbuf)); test lexbuf; }
| '+' { Console.WriteLine("2_" + (lexeme lexbuf)); test lexbuf; }
| '-' { Console.WriteLine("3_" + (lexeme lexbuf)); test lexbuf; }
| '*' { Console.WriteLine("4_" + (lexeme lexbuf)); test lexbuf; }
| '/' { Console.WriteLine("5_" + (lexeme lexbuf)); test lexbuf; }
| '(' { Console.WriteLine("6_" + (lexeme lexbuf)); test lexbuf; }
| ')' { Console.WriteLine("7_" + (lexeme lexbuf)); test lexbuf; }
| eof { () }
Note that the final 'test lexbuf' in each action is something I had to add to make the lexer consume the whole string I provide.
Since I don't do that in my actual implementation below, I only ever get the first number back, and that is all I get:
rule calculator = parse
| digit+ { NUMBER (Convert.ToInt32(lexeme lexbuf)) }
| '+' { PLUS }
| '-' { MINUS }
| '*' { TIMES }
| '/' { DIV }
| '(' { LPAREN }
| ')' { RPAREN }
| eof { EOF }
I have seen many examples structured quite similarly. What am I missing?
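For what it is worth, this is roughly how I exercise the rule by hand and only ever get the first token back. A minimal sketch, assuming the generated modules are named Lexer and Parser and that FSharp.Text.Lexing is referenced (those names are my assumptions here, they are not shown above):

open FSharp.Text.Lexing

let lexbuf = LexBuffer<char>.FromString "1+2*3"

// One call to the rule returns exactly one token and then stops;
// nothing here asks for the rest of the input.
let firstToken = Lexer.calculator lexbuf   // NUMBER 1
printfn "%A" firstToken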
The problem was that you simply cannot expect the lexer to advance on its own. Thinking of it as a stream helped me understand what is happening.
The advancing only works in combination with a parser: the parser keeps asking the lexer for the next token, and each of those requests is what makes the lexer consume the next piece of input.
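To make that concrete, here is a rough sketch of how the tokens actually get pulled out of the buffer, assuming the generated modules are called Lexer and Parser, the parser's start rule is called start, and FSharp.Text.Lexing is referenced (all of these names are assumptions, not taken from the question):

open FSharp.Text.Lexing

let lexbuf = LexBuffer<char>.FromString "1+2*3"

// What a parser effectively does: keep asking the lexer for the next
// token until it sees EOF. Each call advances the buffer by one lexeme.
let rec drain () =
    match Lexer.calculator lexbuf with
    | Parser.EOF -> ()
    | tok ->
        printfn "%A" tok
        drain ()
drain ()

// With FsYacc the generated entry point takes the token function and the
// lexbuf and performs exactly this kind of repeated calling for you:
// let result = Parser.start Lexer.calculator lexbuf

So a rule like calculator above is correct on its own; it just needs something (normally the parser) to call it repeatedly.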