In OCaml, imperative-style loops can be exited early by raising exceptions.
While the use of imperative loops is not idiomatic per se in OCaml, I'd like to know what are the most idiomatic ways to emulate imperative loops with early exits (taking into account aspects such as performance, if possible).
For instance, an old OCaml FAQ mentions exception Exit
:
Exit
: used to jump out of loops or functions.
Is it still current? The standard library simply mentions it as a general-purpose exception:
The
Exit
exception is not raised by any library function. It is provided for use in your programs.
Relatedly, this answer to another question mentions using a precomputed let exit = Exit
exception to avoid allocations inside the loop. Is it still required?
Also, sometimes one wants to exit from the loop with a specific value, such as raise (Leave 42)
. Is there an idiomatic exception or naming convention to do this? Should I use references in this case (e.g. let res = ref -1 in ... <loop body> ... res := 42; raise Exit
)?
Finally, the use of Exit
in nested loops prevents some cases where one would like to exit several loops, like break <label>
in Java. This would require defining exceptions with different names, or at least using an integer to indicate how many scopes should be exited (e.g. Leave 2
to indicate that 2 levels should be exited). Again, is there an approach/exception naming that is idiomatic here?
As originally posted in comments, the idiomatic way to do early exit in OCaml is using continuations. At the point where you want the early return to go to, you create a continuation, and pass it to the code that might return early. This is more general than labels for loops, since you can exit from just about anything that has access to the continuation.
Also, as posted in comments, note the usage of
raise_notrace
for exceptions whose trace you never want the runtime to generate.A "naive" first attempt:
As you can see, the above implements early exit by hiding an exception. The continuation is actually a partially applied function that knows the unique id of the context for which it was created, and has a reference cell to store the result value while the exception is being thrown to that context. The code above prints 15. You can pass the continuation
k
as deep as you want. You can also define the functionf
immediately at the point where it is passed towith_early_exit
, giving an effect similar to having a label on a loop. I use this very often.The problem with the above is the result type of
'a cont
, which I arbitrarily set tounit
. Actually, a function of type'a cont
never returns, so we want it to behave likeraise
– be usable where any type is expected. However, this doesn't immediately work. If you do something liketype ('a, 'b) cont = 'a -> 'b
, and pass that down to your nested function, the type checker will infer a type for'b
in one context, and then force you to call continuations only in contexts with the same type, i.e. you won't be able to do things likebecause the first expression forces
'b
to beint
, but the second requires'b
to bestring
.To solve this, we need to provide a function analogous to
raise
for early return, i.e.This means stepping away from pure continuations. We have to un-partially-apply
make_cont
above (and I renamed it tothrow
), and pass the naked context around instead:The expression
throw k v
can be used in contexts where different types are required.I use this approach pervasively in some big applications I work on. I prefer it even to regular exceptions. I have a more elaborate variant, where
with_early_exit
has a signature roughly like this:where the first function represents an attempt to do something, and the second represents the handler for errors of type
'a
that may result. Together with variants and polymorphic variants, this gives a more explicitly-typed take on exception handling. It is especially powerful with polymorphic variants, as the set of error variands can be inferred by the compiler.The Jane Street approach effectively does the same as what is described here, and in fact I previously had an implementation that generated exception types with first-class modules. I am not sure anymore why I eventually chose this one – there may be subtle differences :)