fslex learning: Lexer not advancing


I am currently learning about lexing and parsing with the F# toolset (fslex/fsyacc), using a simple calculator as the example. I am stuck because my lexer does not advance to consume the whole input string:

let lexeme = LexBuffer<_>.LexemeString
// ...
rule test = parse
  | digit+  { Console.WriteLine("1_" + (lexeme lexbuf)); test lexbuf; }
  | '+'     { Console.WriteLine("2_" + (lexeme lexbuf)); test lexbuf; }
  | '-'     { Console.WriteLine("3_" + (lexeme lexbuf)); test lexbuf; }
  | '*'     { Console.WriteLine("4_" + (lexeme lexbuf)); test lexbuf; }
  | '/'     { Console.WriteLine("5_" + (lexeme lexbuf)); test lexbuf; }
  | '('     { Console.WriteLine("6_" + (lexeme lexbuf)); test lexbuf; }
  | ')'     { Console.WriteLine("7_" + (lexeme lexbuf)); test lexbuf; }
  | eof     { () }

Note that in this test rule I have to end every action with 'test lexbuf' to make sure that the whole string I provide is consumed.

My actual implementation does not do that, so it reads only the first token (e.g. the first number) and that is all I get:

rule calculator = parse
  | digit+  { NUMBER (Convert.ToInt32(lexeme lexbuf)) }
  | '+'     { PLUS }
  | '-'     { MINUS }
  | '*'     { TIMES }
  | '/'     { DIV }
  | '('     { LPAREN }
  | ')'     { RPAREN }
  | eof     { EOF }

I have seen many examples structured quite similarly. What am I missing?


There are 2 answers

flq (best answer)

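The problem was that you simply cannot expect the lexer to advance on its own: each call to the rule matches one token and returns it. To me, thinking of the lexer as a stream from which tokens are pulled one at a time helps in understanding what is happening.

Advancing through the whole input only works in combination with a parser. The parser keeps asking the lexer for the next token until it reaches the end of the input.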
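To make the pull model concrete, here is a minimal self-contained sketch in plain F# that does not use fslex at all. The names Token, Buffer, nextToken and tokenize are hypothetical stand-ins invented for this illustration: nextToken plays the role of the generated 'calculator' rule (one call, one token), and tokenize plays the role of the parser that keeps calling it until EOF.

```fsharp
open System

// Hypothetical token type; with fsyacc this would be generated from the %token declarations.
type Token =
    | NUMBER of int
    | PLUS | MINUS | TIMES | DIV
    | LPAREN | RPAREN
    | EOF

// Stand-in for the lex buffer: the input text plus a mutable read position.
type Buffer = { Text: string; mutable Pos: int }

// Stand-in for the generated rule: each call consumes and returns exactly one token.
let nextToken (buf: Buffer) : Token =
    if buf.Pos >= buf.Text.Length then EOF
    else
        let c = buf.Text.[buf.Pos]
        if Char.IsDigit(c) then
            // Consume a run of digits and convert it to a number.
            let start = buf.Pos
            while buf.Pos < buf.Text.Length && Char.IsDigit(buf.Text.[buf.Pos]) do
                buf.Pos <- buf.Pos + 1
            NUMBER (int (buf.Text.Substring(start, buf.Pos - start)))
        else
            buf.Pos <- buf.Pos + 1
            match c with
            | '+' -> PLUS
            | '-' -> MINUS
            | '*' -> TIMES
            | '/' -> DIV
            | '(' -> LPAREN
            | ')' -> RPAREN
            | _ -> failwithf "unexpected character '%c'" c

// What the caller (normally the parser) does: keep pulling tokens until EOF.
let tokenize (text: string) : Token list =
    let buf = { Text = text; Pos = 0 }
    let rec loop acc =
        match nextToken buf with
        | EOF -> List.rev (EOF :: acc)
        | tok -> loop (tok :: acc)
    loop []

printfn "%A" (tokenize "12+(3*4)")
```

A single call to nextToken on "12+(3*4)" yields only NUMBER 12, which matches what the question observed. The rest of the input is consumed only because tokenize keeps calling it, just as a generated parser would keep calling the lexer rule.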

Stephen Swensen

I am guessing that you have whitespace and/or newlines in your text input, and therefore need rules to handle them (i.e. discard them by advancing the lexbuf rather than producing a token). Something like:

let whitespace = [' ' '\t' ]
let newline = ('\n' | '\r' '\n')

...

| whitespace { calculator lexbuf }
| newline    { lexbuf.EndPos <- lexbuf.EndPos.NextLine; calculator lexbuf }