# Thinking in Lazy Sequences

Asked by At

Taking an example of Fibonacci Series from the Clojure Wiki, the Clojure code is :

``````(def fib-seq
(lazy-cat [0 1] (map + (rest fib-seq) fib-seq)))
``````

If you were to think about this starting from the [0 1], how does it work ? Would be great if there are suggestions on the thought process that goes into thinking in these terms.

## 3 Answers On Best Solutions

As you noted, the `[0 1]` establishes the base cases: The first two values in the sequence are zero, then one. After that, each value is to be the sum of the previous value and the value before that one. Hence, we can't even compute the third value in the sequence without having at least two that come before it. That's why we need two values with which to start off.

Now look at the `map` form. It says to take the head items from two different sequences, combine them with the `+` function (adding multiple values to produce one sum), and expose the result as the next value in a sequence. The `map` form is zipping together two sequences — presumably of equal length — into one sequence of the same length.

The two sequences fed to `map` are different views of the same basic sequence, shifted by one element. The first sequence is "all but the first value of the base sequence". The second sequence is the base sequence itself, which, of course, includes the first value. But what should the base sequence be?

The definition above said that each new element is the sum of the previous (Z - 1) and the predecessor to the previous element (Z - 2). That means that extending the sequence of values requires access to the previously computed values in the same sequence. We definitely need a two-element shift register, but we can also request access to our previous results instead. That's what the recursive reference to the sequence called `fib-seq` does here. The symbol `fib-seq` refers to a sequence that's a concatenation of zero, one, and then the sum of its own Z - 2 and Z - 1 values.

Taking the sequence called `fib-seq`, drawing the first item yields the first element of the `[0 1]` vector — zero. Drawing the second item yields the second element of the vector — one. Upon drawing the third item, we consult the `map` to generate a sequence and use that as the remaining values. The sequence generated by `map` here starts out with the sum of the first item of "the rest of" `[0 1]`, which is one, and the first item of `[0 1]`, which is zero. That sum is one.

Drawing the fourth item consults `map` again, which now must compute the sum of the second item of "the rest of" the base sequence, which is the one generated by `map`, and the second item of the base sequence, which is the one from the vector `[0 1]`. That sum is two.

Drawing the fifth item consults `map`, summing the third item of "the rest of" the base sequence — again, the one resulting from summing zero and one — and the third item of the base sequence — which we just found to be two.

You can see how this is building up to match the intended definition for the series. What's harder to see is whether drawing each item is recomputing all the preceding values twice — once for each sequence examined by `map`. It turns out there's no such repetition here.

To confirm this, augment the definition of `fib-seq` like this to instrument the use of function `+`:

``````(def fib-seq
(lazy-cat [0 1]
(map
(fn [a b]
(println (format "Adding %d and %d." a b))
(+ a b))
(rest fib-seq) fib-seq)))
``````

Now ask for the first ten items:

``````> (doall (take 10 fib-seq))
Adding 1 and 0.
Adding 1 and 1.
Adding 2 and 1.
Adding 3 and 2.
Adding 5 and 3.
Adding 8 and 5.
Adding 13 and 8.
Adding 21 and 13.
(0 1 1 2 3 5 8 13 21 34)
``````

Notice that there are eight calls to `+` to generate the first ten values.

Since writing the preceding discussion, I've spent some time studying the implementation of lazy sequences in Clojure — in particular, the file LazySeq.java — and thought this would be a good place to share a few observations.

First, note that many of the lazy sequence processing functions in Clojure eventually use `lazy-seq` over some other collection. `lazy-seq` creates an instance of the Java type `LazySeq`, which models a small state machine. It has several constructors that allow it to start in different states, but the most interesting case is the one that starts with just a reference to a nullary function. Constructed that way, the `LazySeq` has neither evaluated the function nor found a delegate sequence (type `ISeq` in Java). The first time one asks the `LazySeq` for its first element — via `first` — or any successors — via `next` or `rest` — it evaluates the function, digs down through the resulting object to peel away any wrapping layers of other `LazySeq` instances, and finally feeds the innermost object through the java function `RT#seq()`, which results in an `ISeq` instance.

At this point, the `LazySeq` has an `ISeq` to which to delegate calls on behalf of `first`, `next`, and `rest`. Usually the "head" `ISeq` will be of type `Cons`, which stores a constant value in its "first" (or "car") slot and another `ISeq` in its "rest" (or "cdr") slot. That `ISeq` in its "rest" slot can in turn be a `LazySeq`, in which case accessing it will again require this same evaluation of a function, peeling away any lazy wrappers on the return value, and passing that value through `RT#seq()` to yield another `ISeq` to which to delegate.

The `LazySeq` instances remain linked together, but having forced one (through `first`, `next`, or `rest`) causes it to delegate straight through to some non-lazy `ISeq` thereafter. Usually that forcing evaluates a function that yields a `Cons` bound to first value and its tail bound to another `LazySeq`; it's a chain of generator functions that each yield one value (the `Cons`'s "first" slot) linked to another opportunity to yield more values (a `LazySeq` in the `Cons`'s "rest" slot).

Tying this back, in the Fibonacci Sequence example above, `map` will take each of the nested references to to `fib-seq` and walk them separately via repeated calls to `rest`. Each such call will transform at most one `LazySeq` holding an unevaluated function into a `LazySeq` pointing to something like a `Cons`. Once transformed, any subsequent accesses will quickly resolve to the `Cons`es — where the actual values are stored. When one branch of the `map` zipping walks `fib-seq` one element behind the other, the values have already been resolved and are available for constant-time access, with no further evaluation of the generator function required.

Here are some diagrams to help visualize this interpretation of the code:

``````        +---------+
| LazySeq |
fib-seq |  fn -------> (fn ...)
|  sv     |
|  s      |
+---------+

+---------+
| LazySeq |
fib-seq |  fn -------> (fn ...) -+
|  sv <------------------+
|  s      |
+---------+

+---------+
| LazySeq |
fib-seq |  fn     |
|  sv -------> RT#seq() -+
|  s  <------------------+
+---------+

+---------+   +------+
| LazySeq |   | ISeq |
fib-seq |  fn     |   |      |
|  sv     |   |      |
|  s  ------->|      |
+---------+   +------+

+---------+   +--------+        +------+
| LazySeq |   | Cons   |        | ISeq |
fib-seq |  fn     |   |  first ---> 1   |      |
|  sv     |   |  more  -------->|      |
|  s  ------->|        |        |      |
+---------+   +--------+        +------+

+---------+   +--------+        +---------+
| LazySeq |   | Cons   |        | LazySeq |
fib-seq |  fn     |   |  first ---> 1   |  fn -------> (fn ...)
|  sv     |   |  more  -------->|  sv     |
|  s  ------->|        |        |  s      |
+---------+   +--------+        +---------+
``````

As `map` progresses, it hops from `LazySeq` to `LazySeq` (and hence `Cons` to `Cons`), and the rightmost edge only expands the first time one calls `first`, `next`, or `rest` on a given `LazySeq`. On

As for how this works:

Each term of the fibonacci series is obviously the result of adding the previous two terms. That's what map is doing here, map applies + to each element in each sequence until one of the sequences runs out (which they won't in this case, of course). So the result is a sequence of numbers that result from adding one term in the sequence to the next term in the sequence. Then you need lazy-cat to give it a starting point and make sure the function only returns what it's asked for.

The problem with this implementation is that fib-seq is holding onto the whole sequence for as long as the fib-seq is defined, so it will eventually run you out of memory.

Stuart Halloway's book spends some time on dissecting different implementations of this function, I think the most interesting one is below (it's Christophe Grande's):

``````(defn fibo []
(map first (iterate (fn [[a b]] [b (+ a b)]) [0 1])))
``````

Unlike the posted implementation previously read elements of the sequence have nothing holding onto them so this one can keep running without generating an OutOfMemoryError.

How to get thinking in these terms is a harder question. So far for me it's a matter of getting acquainted with a lot of different ways of doing things and trying them out, while in general looking for ways to apply the existing function library in preference to using recursion and lazy-cat. But in some cases the recursive solution is really great, so it depends on the problem. I'm looking forward to getting the Joy of Clojure book, because I think it will help me a lot with this issue. On

My Clojure is a bit rusty, but this seems to be a literal translation of the famous Haskell one-liner:

``````fibs = 0 : 1 : zipWith (+) fibs (tail fibs)
``````

[I'm going to be using pseudo-Haskell, because it's a little bit more succinct.]

The first thing you need to do, is simply let laziness sink in. When you look at a definition like this:

``````zeroes = 0 : zeroes
``````

Your immediate gut reaction as a strict programmer would be "ZOMG infinite loop! Must fix bug!" But it isn't an infinite loop. This is a lazy infinite loop. If you do something stupid like

``````print zeroes
``````

Then, yes, there will be an infinite loop. But as long as you simply use a finite number of elements, you will never notice that the recursion doesn't actually terminate. This is really hard to get. I still don't.

Laziness is like the monetary system: it's based on the assumption that the vast majority of people never use the vast majority of their money. So, when you put \$1000 in the bank, they don't keep it in their safe. They lend it to someone else. Actually, they leverage the money, which means that they lend \$5000 to someone else. They only ever need enough money so that they can quickly reshuffle it so that it's there when you are looking at it, giving you the appearance that they actually keep your money.

As long as they can manage to always give out money when you walk up to an ATM, it doesn't actually matter that the vast majority of your money isn't there: they only need the small amount you are withdrawing at the point in time when you are making your withdrawal.

Laziness works the same: whenever you look at it, the value is there. If you look at the first, tenth, hundreth, quadrillionth element of `zeroes`, it will be there. But it will only be there, if and when you look at it, not before.

That's why this inifintely recursive definition of `zeroes` works: as long as you don't try to look at the last element (or every element) of an infinite list, you are safe.

Next step is `zipWith`. (Clojure's `map` is just a generalization of what in other programming languages are usually three different functions: `map`, `zip` and `zipWith`. In this example, it is used as `zipWith`.)

The reason why the `zip` family of functions is named that way, is because it actually works like a zipper, and that is also how to best visualize it. Say we have some sporting event, where every contestant gets two tries, and the score from both tries is added up to give the end result. If we have two sequences, `run1` and `run2` with the scores from each run, we can calculate the end result like so:

``````res = zipWith (+) run1 run2
``````

Assuming our two lists are `(3 1 6 8 6)` and `(4 6 7 1 3)`, we line both of those lists up side by side, like the two halves of a zipper, and then we zip them together using our given function (`+` in this case) to yield a new sequence:

``````3   1   6   8   6
+   +   +   +   +
4   6   7   1   3
=   =   =   =   =
7   7  13   9   9
``````

Contestant #3 wins.

So, what does our `fib` look like?

Well, it starts out with a `0`, then we append a `1`, then we append the sum of the infinite list with the infinite list shifted by one element. It's easiest to just draw that out:

• the first element is zero:

``````0
``````
• the second element is one:

``````0   1
``````
• the third element is the first element plus the first element of the rest (i.e. the second element). We visualize this again like a zipper, by putting the two lists on top of each other.

``````0   1
+
1
=
1
``````
• Now, the element that we just computed is not just the output of the `zipWith` function, it is at the same time also the input, as it gets appended to both lists (which are actually the same list, just shifted by one):

``````0   1   1
+   +
1   1
=   =
1   2
``````
• and so forth:

``````0   1   1   2
+   +   +
1   1   2
=   =   =
1   2   3

0   1   1   2   3   ^
+   +   +   +       |
1   1   2   3   ^   |
=   =   =   =   |   |
1   2   3   5---+---+
``````

Or if you draw it a little bit differently and merge the result list and the second input list (which really are the same, anyway) into one:

``````0   1   1   2   3   ^
+   +   +   +   +   |
1 = 1 = 2 = 3 = 5---+
``````

That's how I visualize it, anyway.