Are F# fslex and fsyacc mature enough for production code?


After reading a two-year-old web page that really rips into fslex/fsyacc (buggy, slow, stupid, etc.) compared to their OCaml counterparts, I wonder what one's best bet would be for lexing and parsing needs.

I've used ANTLR before with C# bindings, but I am currently learning F# and was excited to see that it comes with a parser generator. F# is now officially released, and it seems to be something Microsoft really intends to support and develop. Would you say fslex and fsyacc are worth using for production code?

There are 3 answers

2
Tomas Petricek On

Fslex and fsyacc are certainly ready for production use. After all, they are used in Microsoft Visual Studio 2010, because the F# lexer and parser are written using them (The F# compiler source code is also a good example that demonstrates how to use them efficiently).

I'm not sure how fslex/fsyacc compare to their OCaml equivalents or to ANTLR. However, Frederik Holmstrom has an article that compares ANTLR with a hand-written parser written in F# and used in IronJS. Unfortunately, he doesn't have an fslex/fsyacc version, so there is no direct comparison.

To answer some specific concerns: you can get MSBuild tasks for running fslex/fsyacc as part of the build, so they integrate quite well. You don't get syntax highlighting, but I don't think that's such a big deal. They may be slower than the OCaml versions, but that affects compilation only when you change the parser. I made some modifications to the F# parser and didn't find the compilation time a problem.

0
Laurent On

Fslex and fsyacc are used by the F# compiler, so they kind of work. I used them a few years ago, and they were good enough for my needs.

However, my experience is that lex/yacc is much less mature in F# than in OCaml. Many people in the OCaml community have used these tools for years, including many students (writing a small interpreter or compiler with them seems to be a common exercise). I don't think many F# developers have used them, and I don't think the F# team has done a lot of work on these tools recently (for instance, VS integration has not been a priority). If you're not very demanding, fslex and fsyacc could be enough for you.

One solution could be to adapt Menhir (an ocamlyacc replacement with several nice features) for use with F#. I have no idea how much work that would be.

Personally, I now use FParsec every time I need to write a parser. It's quite different to use, but it's also much more flexible and it generates good parse error messages. I've been very happy with it and its author has always been very helpful when I had questions.

0
J D On

The fslex and fsyacc tools were specifically written for the F# compiler and were not intended for wider use. That said, I have managed to get significant code bases ported from OCaml to F# thanks to these tools but it was laborious due to the complete lack of VS integration on the F# side (OCaml has excellent integration with syntax highlighting, jump to definition and error throwback). In particular, I moved as much of the F# code out of the lexer and parser as possible.

We have often needed to write parsers and have asked Microsoft to add official support for fslex and fsyacc but I do not believe this will happen.

My advice would be to use fslex and fsyacc only if you are faced with translating a large legacy OCaml code base that uses ocamllex and ocamlyacc. Otherwise, write a parser from scratch.

I am personally not a fan of parser combinator libraries and prefer to write parsers using active patterns that look something like this s-expression parser:

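// Character classes, represented as sets of characters.
let alpha = set['A'..'Z'] + set['a'..'z']
let numeric = set['0'..'9']
let alphanumeric = alpha + numeric

// A cursor is a (string, index) pair. Next yields the current character
// together with the advanced cursor; Empty matches at the end of input.
let (|Empty|Next|) (s: string, i) =
  if i < s.Length then Next(s.[i], (s, i+1)) else Empty

// Matches a single character from the alphabet and returns the advanced cursor.
let (|Char|_|) alphabet = function
  | Empty -> None
  | s, i when Set.contains s.[i] alphabet -> Some(s, i+1)
  | _ -> None

// Consumes zero or more characters from the alphabet; always succeeds.
let rec (|Chars|) alphabet = function
  | Char alphabet (Chars alphabet it)
  | it -> it

// The text between two cursors over the same string.
let sub (s: string, i0) (_, i1) =
  s.Substring(i0, i1-i0)

// One s-expression: leading whitespace is skipped, then either a symbol
// (a letter followed by alphanumerics) or a parenthesised list is read.
let rec (|SExpr|_|) = function
  | Next ((' ' | '\n' | '\t'), SExpr(f, it)) -> Some(f, it)
  | Char alpha (Chars alphanumeric it1) as it0 -> Some(box(sub it0 it1), it1)
  | Next ('(', SExprs(fs, Next(')', it))) -> Some(fs, it)
  | _ -> None
// Zero or more s-expressions, built as nested boxed pairs ending in null.
and (|SExprs|) = function
  | SExpr(f, SExprs(fs, it)) -> box(f, fs), it
  | it -> null, it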

This approach does not require any VS integration because it is just vanilla F# code. I find it easy to read and maintainable. Performance has been more than adequate in my production code.
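
To illustrate how patterns in this style are driven from ordinary code, here is a minimal self-contained sketch that applies the same cursor-based approach to a smaller grammar, a comma-separated list of non-negative integers. The names `Int`, `Ints` and `parseInts` are hypothetical and do not appear in the parser above; only `Empty`/`Next` is reused as written.

```fsharp
// A sketch under stated assumptions, not part of the s-expression parser:
// Int, Ints and parseInts are illustrative names introduced here.

// A cursor is a (string, index) pair, as in the parser above.
let (|Empty|Next|) (s: string, i) =
  if i < s.Length then Next(s.[i], (s, i+1)) else Empty

// Matches one or more decimal digits; returns their value and the rest.
// Values too large for a 32-bit int will raise an exception.
let (|Int|_|) (s: string, i0) =
  let rec skip i =
    if i < s.Length && System.Char.IsDigit(s.[i]) then skip (i+1) else i
  let i1 = skip i0
  if i1 > i0 then Some(int (s.Substring(i0, i1-i0)), (s, i1)) else None

// Matches a comma-separated list of integers. Always succeeds, possibly
// with an empty list. Note that a trailing comma is tolerated.
let rec (|Ints|) = function
  | Int(n, Next(',', Ints(ns, it))) -> n::ns, it
  | Int(n, it) -> [n], it
  | it -> [], it

// Entry point: succeed only if the whole input was consumed.
let parseInts (text: string) =
  match (text, 0) with
  | Ints(ns, Empty) -> Some ns
  | _ -> None
```

Calling `parseInts "12,7,305"` starts the match at index 0 and requires the remaining cursor to be `Empty`, so input with leftover characters (for example `"1;2"`) yields `None`. The entry point is the only place where the raw string is paired with an index; everything else works on cursors.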