Why doesn't Haskell need Trampolining?

2.1k views Asked by At

As Scala developer learning the IO Monad, and therefore technicalities of Trampolining in general that are necessary for recursion where tail call optimization is not possible, I wonder how Haskell seems to natively avoid it.

I get that Haskell is a lazy language, however I wonder if someone could elaborate a bit further.

For instance, why doesn't ForeverM stackoverflow in scala? Well, I can answer for trampoline, and I can find the actual code that does that in libraries and blogs. I actually implemented a basic trampoline myself to learn.

How does it happens in Haskell? Is there a way to unpack the laziness a bit, give some pointers, and maybe documentation that would help in understanding it better?

sealed trait IO[A] {

.....


  def flatMap[B](f: A => IO[B]): IO[B] =
    FlatMap[A,B](this, f) // we do not interpret the `flatMap` here, just return it as a value
  def map[B](f: A => B): IO[B] =
    flatMap[B](f andThen (Return(_)))

}
case class Return[A](a: A) extends IO[A]
case class Suspend[A](resume: () => A) extends IO[A]
case class FlatMap[A,B](sub: IO[A], k: A => IO[B]) extends IO[B]

......

@annotation.tailrec
def run[A](io: IO[A]): A = io match {
  case Return(a) => a
  case Suspend(r) => r()
  case FlatMap(x, f) => x match {
    case Return(a) => run(f (a))
    case Suspend(r) => run(f( r()))
    case FlatMap(y, g) => run(y flatMap (a => g(a) flatMap f))
  }
}
2

There are 2 answers

18
Levi Ramsey On

Functional programming in general requires tail-call elimination (otherwise the deep chains of function calls overflow the stack). For example, consider this (absurdly inefficient) implementation of an even/odd classifier:

def even(i: Int): Boolean =
  if (i == 0) true
  else if (i > 0) odd(i - 1)
  else odd(i + 1)

def odd(i: Int): Boolean =
  if (i == 0) false
  else if (i > 0) even(i - 1)
  else even(i + 1)

In both even and odd, every branch is either a simple expression (true or false in this case) which doesn't make a function call or a tail-call: the value of the called function is returned without being operated on.

Without tail-call elimination, the (potentially recursive with an indefinite length of a cycle) calls have to be implemented using a stack which consumes memory, because the caller may do something with the result. Tail-call elimination relies on observing that the caller doesn't do anything with the result, therefore the called function can effectively replace the caller on the stack.

Haskell and essentially every other post-Scheme functional language runtime implements generalized tail-call elimination: tail-calls become an unconditional jump (think a GOTO). The famous series of Steele and Sussman papers (the PDFs unfortunately didn't get archived, but you can search for, e.g. AIM-443 (mit or steele or sussman might be required)) known as "Lambda: The Ultimate" (which inspired the name of a programming language forum) goes through the implications of tail-call elimination and how this means that functional programming is actually viable for solving real-world computing problems.

Scala, however, primarily targets the Java Virtual Machine, the specification of which effectively (by design) prohibits generalized tail-call elimination, and the instruction set of which constrains unconditional jumps to not cross the boundaries of a method. In certain limited contexts (basically recursive calls of a method where the compiler can be absolutely sure of what implementation is being called), the Scala compiler performs the tail-call elimination before emitting the Java bytecode (it's theoretically conceivable that Scala Native could perform generalized tail-call elimination, but that would entail some semantic break with JVM and JS Scala (some JavaScript runtimes perform generalized tail-call elimination, though not V8 to my knowledge)). The @tailrec annotation, with which you may have some familiarity, enforces a requirement that the compiler be able to perform tail-call elimination.

Trampolining is a low-level technique at runtime for emulating compile-time tail-call elimination, especially in languages like C or Scala. Since Haskell has performed the tail-call elimination at compile-time, there's thus no need for the complexity of a trampoline (and the requirement to write the high-level code into continuation-passing style).

You can arguably think of the CPU in a Haskell program (or the runtime itself if transpiling to, e.g. JS) as implementing a trampoline.

0
jpaugh On

Trampolining is not the only solution for tail calls. Scala requires trampolining precisely because it runs on the JVM, with the Java runtime. The Scala language developers did not get to choose precisely how their runtime operates, nor their binary format. Because they use the JVM, they must endure every way that the JVM is optimized for Java and not for Scala.

Haskell does not have this limitation, because it has its own runtime, it's own binary format, etc. It can choose precisely how to set up the stack at runtime based on language-level constructs of the Haskell language --- not, of the Java one.