Why does let not allow mutually recursive definitions, whereas letrec can?

458 views Asked by At

I suspect that I fundamentally misunderstand Scheme's evaluation rules. What is it about the way that let and letrec are coded and evaluated that makes letrec able to accept mutually recursive definitions whereas let cannot? Appeals to their basic lambda forms may be helpful.

3

There are 3 answers

2
ignis volens On BEST ANSWER

let can't create mutually-recursive functions in any obvious way because you can always turn let into lambda:

(let ((x 1))
  ...)
-->
((λ (x)
   ...)
 1)

and similarly for more than one binding:

(let ((x 1)
      (y 2))
  ...)
-->
((λ (x y)
   ...)
 1 2)

Here and below, --> means 'can be translated into' or even 'could be macroexpanded into'.

OK, so now consider the case where the x and y are functions:

(let ((x (λ (...) ...))
      (y (λ (...) ...)))
  ...)
-->
((λ (x y)
   ...)
 (λ (...) ...)
 (λ (...) ...))

And now it's becoming fairly clear why this can't work for recursive functions:

(let ((x (λ (...) ... (y ...) ...))
      (y (λ (...) ... (x ...) ...)))
  ...)
-->
((λ (x y)
   ...)
 (λ (...) (y ...) ...)
 (λ (...) (x ...) ...))

Well, let's make this more concrete to see what goes wrong: let's consider a single recursive function: if there's a problem with that then there certainly will be problems with sets of mutually recursive functions.

(let ((factorial (λ (n)
                   (if (= n 1) 1
                       (* n (factorial (- n 1)))))))
  (factorial 10))
-->
((λ (factorial)
   (factorial 10))
 (λ (n)
   (if (= n 1) 1
       (* n (factorial (- n 1))))))

OK, what happens when you try to evaluate the form? We can use the environment model as described in SICP . In particular consider evaluating this form in an environment, e, in which there is no binding for factorial. Here's the form:

((λ (factorial)
   (factorial 10))
 (λ (n)
   (if (= n 1) 1
       (* n (factorial (- n 1))))))

Well, this is just a function application with a single argument, so to evaluate this you simply evaluate, in some order, the function form and its argument.

Start with the function form:

(λ (factorial)
  (factorial 10))

This just evaluates to a function which, when called, will:

  1. create an environment e' which extends e with a binding of factorial to the argument of the function;
  2. call whatever is bound to factorial with the argument 10 and return the result.

So now we have to evaluate the argument, again in the environment e:

(λ (n)
  (if (= n 1) 1
      (* n (factorial (- n 1)))))

This evaluates to a function of one argument which, when called, will:

  1. establish an environment e'' which extends e with a binding of n to the argument of the function;
  2. if the argument isn't 1, will try to call some function bound to a variable called factorial, looking up this binding in e''.

Hold on: what function? There is no binding of factorial in e, and e'' extends e (in particular, e'' does not extend e'), but by adding a binding of n, not factorial. Thus there is no binding of factorial in e''. So this function is an error: you will either get an error when it's evaluated or you'll get an error when it's called, depending on how smart the implementation is.

Instead, you need to do something like this to make this work:

(let ((factorial (λ (n) (error "bad doom"))))
  (set! factorial
        (λ (n)
          (if (= n 1) 1
              (* n (factorial (- n 1))))))
  (factorial 10))
-->
((λ (factorial)
   (set! factorial
         (λ (n)
           (if (= n 1) 1
               (* n (factorial (- n 1))))))
   (factorial 10))
 (λ (n) (error "bad doom")))

This will now work. Again, it's a function application, but this time all the action happens in the function:

(λ (factorial)
  (set! factorial
        (λ (n)
          (if (= n 1) 1
              (* n (factorial (- n 1))))))
  (factorial 10))

So, evaluating this in e results in a function which, when called will:

  1. create an environment e', extending e, in which there is a binding of factorial to whatever its argument is;
  2. mutate the binding of factorial in e', assigning a different value to it;
  3. call the value of factorial in e', with argument 10, returning the result.

So the interesting step is (2): the new value of factorial is the value of this form, evaluated in e':

(λ (n)
  (if (= n 1) 1
      (* n (factorial (- n 1)))

Well, this again is a function which, when called, will:

  1. create an environent, e'', which extends e' (NOTE!) with a binding for n;
  2. if the value of the binding of n is not 1, call whatever is bound to factorial in the e'' environment.

And now this will work, because there is a binding of factorial in e'', because, now, e'' extends e' and there is a binding of factorial in e'. And, further, by the time the function is called, e' will have been mutated so that the binding is the correct one: it's the function itself.

And this is in fact more-or-less how letrec is defined. In a form like

(letrec ((a <f1>)
         (b <f2>))
  ...)

First the variables, a and b are bound to some undefined values (it is an error ever to refer to these values). Then <f1> and <f2> are evaluated in some order, in the resulting environment (this evaluation should not refer to the values that a and b have at that point), and the results of these evaluations are assigned to a and b respectively, mutating their bindings and finally the body is evaluated in the resulting environment.

See for instance R6RS (other reports are similar but harder to refer to as most of them are PDF):

The <variable>s are bound to fresh locations, the <init>s are evaluated in the resulting environment (in some unspecified order), each <variable> is assigned to the result of the corresponding <init>, the <body> is evaluated in the resulting environment, and the values of the last expression in <body> are returned. Each binding of a <variable> has the entire letrec expression as its region, making it possible to define mutually recursive procedures.

This is obviously something similar to what define must do, and in fact I think it's clear that, for internal define at least, you can always turn define into letrec:

(define (outer a)
  (define (inner b)
    ...)
  ...)
-->
(define (outer a)
  (letrec ((inner (λ (b) ...)))
    ...))

And perhaps this is the same as

(letrec ((outer
          (λ (a)
            (letrec ((inner
                      (λ (b)
                        ...)))
              ...)))))

But I am not sure.


Of course, letrec buys you no computational power (neither does define): you can define recursive functions without it, it's just less convenient:

(let ((facter
       (λ (c n)
         (if (= n 1)
             1
             (* n (c c (- n 1)))))))
  (let ((factorial
         (λ (n)
           (facter facter n))))
    (factorial 10)))

or, for the pure of heart:

((λ (facter)
   ((λ (factorial)
      (factorial 10))
    (λ (n)
      (facter facter n))))
 (λ (c n)
   (if (= n 1)
       1
       (* n (c c (- n 1))))))

This is pretty close to the U combinator, or I always think it is.


Finally, it's reasonably easy to write a quick-and-dirty letrec as a macro. Here's one called labels (see the comments to this answer):

(define-syntax labels
  (syntax-rules ()
    [(_ ((var init) ...) form ...)
     (let ((var (λ x (error "bad doom"))) ...)
       (set! var init) ...
       form ...)]))

This will work for conforming uses, but it can't make referring to the initial bindings of the variables is an error: calling them is, but they can leak out. Racket, for instance, does some magic which makes this be an error.

2
DBridgham On

Let's start with my version of everyone's favorite mutually recursive example, even-or-odd.

(define (even-or-odd x)
  (letrec ((internal-even? (lambda (n)
                             (if (= n 0) 'even
                                 (internal-odd? (- n 1)))))
           (internal-odd? (lambda (n)
                            (if (= n 0) 'odd
                                (internal-even? (- n 1))))))
    (internal-even? x)))

If you wrote that with let instead of letrec, you'd get an error that internal-even? in unbound. The descriptive reason for why that is is that the expressions that define the initial values in a let are evaluated in a lexical environment before the variables are bound whereas letrec creates an environment with those variables first, just to make this work.

Now we'll have a look at how to implement let and letrec with lambda so you can see why this might be.

The implementation of let is fairly straightforward. The general form is something like this:

(let ((x value)) body)  -->  ((lambda (x) body) value)

And so even-or-odd written with a let would become:

(define (even-or-odd-let x)
  ((lambda (internal-even? internal-odd?)
     (internal-even? x))
   (lambda (n)
     (if (= n 0) 'even
         (internal-odd? (- n 1))))
   (lambda (n)
     (if (= n 0) 'odd
         (internal-even? (- n 1))))))

You can see that the bodies of internal-even? and internal-odd? are defined outside the scope of where those names are bound. It gets an error.

To deal with this problem when you want recursion to work, letrec does something like this:

(letrec (x value) body)  -->  ((lambda (x) (set! x value) body) #f)

[Note: There's probably a much better way of implementing letrec but that's what I'm coming up with off the top of my head. It'll give you the idea, anyway.]

And now even-or-odd? becomes:

(define (even-or-odd-letrec x)
  ((lambda (internal-even? internal-odd?)
     (set! internal-even? (lambda (n)
                (if (= n 0) 'even
                    (internal-odd? (- n 1)))))
     (set! internal-odd? (lambda (n)
               (if (= n 0) 'odd
                   (internal-even? (- n 1)))))
     (internal-even? x))
   #f #f))

Now internal-even? and internal-odd? are being used in a context where they've been bound and it all works.

12
Will Ness On

The following is even? without let or letrec:

(define even?
  ( (lambda (e o)              <------.      ------------.
      (lambda (n) (e n e o))     -----*                  |
      )                                                  |
    (lambda (n e o)                      <------.    <---+
       (if (= n 0) #t (o (- n 1) e o)))    -----*        |
    (lambda (n e o)                        <------.  <---*
       (if (= n 0) #f (e (- n 1) e o)))))    -----*

This defines the name even? to refer to the result of evaluating the application of the object returned by evaluating the (lambda (e o) (lambda (n) (e n e o)) ) expression to two objects produced by evaluating the other two lambda expressions, the ones in the arguments positions.

Each of the argument lambda expressions is well formed, in particular there are no references to undefined names. Each only refers to its arguments.

The following is the same even?, written with let for convenience:

(define even?-let-
  (let ((e  (lambda (n e o)                     <------.  <---.
              (if (= n 0) #t (o (- n 1) e o))))   -----*      |
        (o  (lambda (n e o)                     <------.  <---+
              (if (= n 0) #f (e (- n 1) e o))))   -----*      |
        )                                                     |
    (lambda (n) (e n e o)) ))     ----------------------------*

But what if we won't pass those e and o values around as arguments?

(define even?-let-wrong-                       ^  ^
  (let ((e  (lambda (n)      <-----------------|--|-------.
              (if (= n 0) #t (o (- n 1)))))  --*  |       |
        (o  (lambda (n)                           |       |
              (if (= n 0) #f (e (- n 1)))))     --*       |
        )                                                 |
    (lambda (n) (e n)) ))      ---------------------------*

What are the two names o and e inside the two lambda's if expressions refer to now?

They refer to nothing defined in this piece of code. They are "out of scope".

Why? It can be seen in the equivalent lambda-based expression, similar to what we've started with, above:

(define even?-wrong-                   ^   ^
  ( (lambda (e o)         <----.   ----|---|---------.
      (lambda (n) (e n))     --*       |   |         |
      )                                |   |         |
    (lambda (n)                        |   |  <------+
       (if (= n 0) #t (o (- n 1)))) ---*   |         |
    (lambda (n)                            |  <------*
       (if (= n 0) #f (e (- n 1)))))) -----*  

This defines the name even?-wrong- to refer to the result of evaluating the application of the object returned by evaluating the (lambda (e o) (lambda (n) (e n)) ) expression to two objects produced by evaluating the other two lambda expressions, the ones in the arguments positions.

But each of them contains a free variable, a name which refers to nothing defined in its scope. One contains an undefined o, and the other contains an undefined e.

Why? Because in the application (<A> <B> <C>), each of the three expressions <A>, <B>, and <C> is evaluated in the same scope -- that in which the application itself appears; the enclosing scope. And then the resulting values are applied to each other (in other words, the function call is made).

A "scope" is simply a textual area in a code.

Yet we need the o in the first argument lambda to refer to the second argument lambda, not anything else (or even worse, nothing at all). Same with the e in the second argument lambda, which we need to point at the first argument lambda.


let evaluates its variables' init expressions in the enclosing scope of the whole let expression first, and then it creates a fresh environment frame with its variables' names bound to the values which result from those evaluations. The same thing happens with the equivalent three-lambdas expression evaluation.

letrec, on the other hand, first creates the fresh environment frame with its variables' names bound to as yet-undefined-values (such that referring to those values is guaranteed to result in an error) and then it evaluates its init expressions in this new self-referential frame, and then it puts the resulting values into the bindings for their corresponding names.

Which makes the names inside the lambda expressions refer to the names inside the whole letrec expression's scope. In contrast to the let's referring to the outer scope:

             ^        ^
             |        |
(let ((e     |        |
        (... o ...))  |
      (o              |
        (............ e .........)))
  .....)

does not work;

                  .----------------.
                  |                |
(letrec ((e    <--|--------.       |
           (..... o ...))  |       |
         (o    <-----------|-------*
           (.............. e .........)))
  .....)

works fine.


Here's an example to further clarify things:

1. consider ((lambda (a b) ....here a is 1.... (set! a 3) ....here a is 3....) 1 2)

  1. now consider ((lambda (a b) .....) (lambda (x) (+ a x)) 2).

the two as are different -- the argument lambda is ill-defined.

  1. now consider ((lambda (a b) ...(set! a (lambda (x) (+ a x))) ...) 1 2).

the two as are now the same.

so now it works.