I am reading The Seasoned Schemer by Friedman and Felleisen, but I am a little uneasy with some of their best practices. In particular, the authors recommend:
- using letrec to remove arguments that do not change for recursive applications;
- using letrec to hide and to protect functions;
- using letcc to return values abruptly and promptly (see the small example after this list).
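To make sure I understand the third recommendation, here is a minimal example of my own (first-negative is not from the book): let/cc, which is how the book's letcc is spelled in Racket's scheme language, captures the current continuation, and applying it returns from the whole expression at once.

(define first-negative
  (lambda (lst)
    (let/cc return
      ;; applying `return` aborts the traversal immediately
      (for-each (lambda (x)
                  (when (negative? x) (return x)))
                lst)
      #f)))  ; no negative number found

(first-negative '(3 1 -4 1 -5))  ; => -4, without looking at 1 and -5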
Let's examine some consequences of these rules. Consider for example this code for computing the intersection of a list of lists:
#lang scheme

(define intersectall
  (lambda (lset)
    (let/cc hop
      (letrec ([A (lambda (lset)
                    (cond [(null? (car lset)) (hop '())]
                          [(null? (cdr lset)) (car lset)]
                          [else (I (car lset) (A (cdr lset)))]))]
               [I (lambda (s1 s2)
                    (letrec ([J (lambda (s1)
                                  (cond [(null? s1) '()]
                                        [(M? (car s1) s2)
                                         (cons (car s1) (J (cdr s1)))]
                                        [else (J (cdr s1))]))]
                             [M? (lambda (el s)
                                   (letrec ([N? (lambda (s)
                                                  (cond [(null? s) #f]
                                                        [else (or (eq? (car s) el)
                                                                  (N? (cdr s)))]))])
                                     (N? s)))])
                      (cond [(null? s2) (hop '())]
                            [else (J s1)])))])
        (cond [(null? lset) '()]
              [else (A lset)])))))
This example appears in Chapter 13 (not exactly in this form: I glued in the membership-testing code, which is defined separately in a previous paragraph).
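A couple of REPL checks (my own, not from the book) show what hop buys: intersectall bails out the moment it sees an empty set, without computing any intersections.

(intersectall '((a b c) (b c d) (c d e)))  ; => (c)
(intersectall '((a b c) () (d e f)))       ; => (), via hop, before I or J
                                           ;    is ever called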
I think the following alternative implementation, which uses a single letrec and no letcc, is much more readable and simpler to understand:
(define intersectall-naive
  (lambda (lset)
    (letrec ([IA (lambda (lset)
                   (cond [(null? (car lset)) '()]
                         [(null? (cdr lset)) (car lset)]
                         [else (intersect (car lset) (IA (cdr lset)))]))]
             [intersect (lambda (s1 s2)
                          (cond [(null? s1) '()]
                                [(M? (car s1) s2)
                                 (cons (car s1) (intersect (cdr s1) s2))]
                                [else (intersect (cdr s1) s2)]))]
             [M? (lambda (el s)
                   (cond [(null? s) #f]
                         [else (or (eq? (car s) el) (M? el (cdr s)))]))])
      (cond [(null? lset) '()]
            [else (IA lset)]))))
I am new to Scheme and my background is not in computer science, but it strikes me that we end up with such complex code for a simple list-intersection problem. It makes me wonder how people manage the complexity of real-world applications.
Do experienced Schemers spend their days deeply nesting letcc and letrec expressions? This was the motivation for asking on Stack Exchange.
My question is: are Friedman and Felleisen overly complicating this example for education's sake, or should I just get accustomed to Scheme code full of letccs and letrecs, written that way for performance reasons?
Is my naive code going to be impractically slow for large lists?
I'm not an expert on Scheme implementations, but I have some ideas about what's going on here. One advantage the authors have via their let/cc, and that you don't have, is early termination as soon as it's clear what the entire result will be. Suppose someone evaluates

(intersectall (list gigantic-list gigantic-list gigantic-list '()))

Your IA will transform this to

(intersect gigantic-list
           (intersect gigantic-list
                      (intersect gigantic-list '())))

which is reasonable enough. The innermost intersection will be computed first, and since gigantic-list is not nil, it will traverse the entirety of gigantic-list, checking for each item whether that item is a member of '(). None are, of course, so this results in '(), but you did have to walk the entire input to find that out. This process will repeat at each nested intersect call: your inner procedures have no way to signal "it's hopeless, just give up", because they communicate only by their return values.

Of course you can solve this without let/cc, by checking the return value of each intersect call for nullness before continuing (see the sketch below). But (a) it is rather pretty to have this check occur in only one direction instead of both, and (b) not all problems will be so amenable: maybe you want to return something for which you can't so easily signal that an early exit is desired. The let/cc approach is general, and allows an early exit from any context.
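For instance, a let/cc-free variant of your code might give up as soon as any intermediate result is empty (a sketch; intersectall-shortcut is my name for it, and intersect and M? are as in your question):

(define intersectall-shortcut
  (lambda (lset)
    (letrec ([IA (lambda (lset)
                   (cond [(null? (car lset)) '()]
                         [(null? (cdr lset)) (car lset)]
                         [else (let ([rest (IA (cdr lset))])
                                 ;; also check the result coming back up
                                 (if (null? rest)
                                     '()
                                     (intersect (car lset) rest)))]))]
             [intersect (lambda (s1 s2)
                          (cond [(null? s1) '()]
                                [(M? (car s1) s2)
                                 (cons (car s1) (intersect (cdr s1) s2))]
                                [else (intersect (cdr s1) s2)]))]
             [M? (lambda (el s)
                   (cond [(null? s) #f]
                         [else (or (eq? (car s) el) (M? el (cdr s)))]))])
      (cond [(null? lset) '()]
            [else (IA lset)]))))

Note that this is the check "in both directions": IA tests for an empty set on the way down, with (null? (car lset)), and again on the way back up, with (null? rest), whereas the hop version only needs the downward test.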
As for using letrec to avoid repeating constant arguments in recursive calls: again, I am not an expert on Scheme implementations, but in Haskell I have heard the guidance that closing over a single parameter is a wash, while closing over two or more parameters improves performance. This makes sense to me given how closures are stored. But I doubt it is "critical" in any sense unless you have a large number of arguments or your recursive functions do very little work: argument handling will be a small portion of the work done. I would not be surprised to find that the authors do this because they think it improves clarity, rather than for performance reasons. If I see something like (the predicates and body here are placeholders)

(define (f x y z)
  (letrec ([g (lambda (p q r)
                (do-something p q r))])
    (cond [(case-a? x) (g x y z)]
          [(case-b? y) (g x y z)]
          [(case-c? z) (g x y z)]
          [else (g x y z)])))

I will be rather less happy than if I see

(define (f x y z)
  (letrec ([g (lambda ()
                (do-something x y z))])
    (cond [(case-a? x) (g)]
          [(case-b? y) (g)]
          [(case-c? z) (g)]
          [else (g)])))

because I have to discover that p is actually just another name for x, and so on, check that the same x, y, and z are used in each call, and confirm that this is on purpose. In the latter case, it is obvious that x continues to have that meaning throughout, because there is no other variable holding that value. Of course this is a simplified example, and I would not be thrilled to see four literal applications of g regardless, but the same idea holds for recursive functions.
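For the recursive case, the same trade-off might look like this (a toy count-in-range function of my own, where n changes at every call but lo and hi never do):

;; constant arguments threaded through every recursive call:
(define (count-in-range n lo hi)
  (cond [(zero? n) 0]
        [(and (<= lo n) (<= n hi))
         (+ 1 (count-in-range (- n 1) lo hi))]
        [else (count-in-range (- n 1) lo hi)]))

;; the same function with a letrec closing over the constants, book style:
(define (count-in-range2 n lo hi)
  (letrec ([go (lambda (n)
                 (cond [(zero? n) 0]
                       [(and (<= lo n) (<= n hi)) (+ 1 (go (- n 1)))]
                       [else (go (- n 1))]))])
    (go n)))

Whether the second reads better is a matter of taste, but it does make it impossible for a recursive call to accidentally pass a different lo or hi.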