Using ocamlyacc with sedlex


I am trying to figure out how to use ocamlyacc with sedlex.

lexer.ml (using sedlex):

let rec lex (lexbuf: Sedlexing.lexbuf) =
    match%sedlex lexbuf with
    | white_space -> lex lexbuf
    (* ... other lexing rules ... *)
    | _ -> failwith "Unrecognized."

I also have an ocamlyacc file named parser.mly, which contains parse as one of the grammar rules.

To parse a string, I used this:

let lexbuf = Sedlexing.Utf8.from_string s in
let parsed = (Parser.parse Lexer.lex) lexbuf in
(* ... do things ... *)

But during compilation, this error appears (caused by the Lexer.lex above):

Error: This expression has type Sedlexing.lexbuf -> Parser.token
       but an expression was expected of type Lexing.lexbuf -> Parser.token
       Type Sedlexing.lexbuf is not compatible with type Lexing.lexbuf

From my understanding, this error appears because ocamlyacc expects the lexer to be generated by ocamllex, and not by sedlex. So the question is: how can I use ocamlyacc with sedlex?


There is 1 answer

Answered by octachron

If you don't have a very specific reason to use ocamlyacc rather than Menhir, it is probably much simpler to use Menhir and convert the parsing function to the revised API, which requires only a token-producer function of type unit -> token * position * position:

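(* generated_lexer stands for the sedlex lexing function (Lexer.lex in the question). *)
let provider lexbuf () =
  let tok = generated_lexer lexbuf in
  let start, stop = Sedlexing.lexing_positions lexbuf in
  tok, start, stop

(* generated_parser_entry_point stands for the Menhir entry point (Parser.parse in the question). *)
let parser_result =
  MenhirLib.Convert.Simplified.traditional2revised
    generated_parser_entry_point
    (provider lexbuf)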

Otherwise, you need to build a function of type Lexing.lexbuf -> token from your Sedlexing.lexbuf -> token function: it takes a dummy Lexing.lexbuf as input, applies the real lexing function to the sedlex buffer, copies the location information into the dummy Lexing.lexbuf, and then returns the token.
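
A minimal sketch of such a wrapper might look like the following. The name to_lexing_lexer and the ~positions parameter are hypothetical, introduced only for illustration; the wrapper is kept generic over the buffer type so that it depends on the standard library alone. With sedlex, ~positions would presumably be Sedlexing.lexing_positions and lex would be Lexer.lex from the question.

```ocaml
(* Hypothetical helper: turn a lexer over some other buffer type into a
   function of type Lexing.lexbuf -> token, as expected by ocamlyacc. *)
let to_lexing_lexer
    ~(positions : 'buf -> Lexing.position * Lexing.position)
    (lex : 'buf -> 'token)
    (buf : 'buf) : Lexing.lexbuf -> 'token =
  fun dummy ->
    (* Run the real lexer on the real buffer; the dummy's contents are ignored. *)
    let tok = lex buf in
    (* Copy the token's location into the dummy lexbuf, since that is where
       the generated parser reads symbol positions from. *)
    let start_p, curr_p = positions buf in
    dummy.Lexing.lex_start_p <- start_p;
    dummy.Lexing.lex_curr_p <- curr_p;
    tok

(* Assumed usage with the question's modules (not compiled here):

   let parse_string s =
     let sbuf = Sedlexing.Utf8.from_string s in
     let lexer =
       to_lexing_lexer ~positions:Sedlexing.lexing_positions Lexer.lex sbuf in
     Parser.parse lexer (Lexing.from_string "")
*)
```

The dummy buffer created with Lexing.from_string "" is never read from; it only carries the positions that the ocamlyacc-generated parser looks up after each token.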