Find the elements of a LazySeq that have been realized

I have a LazySeq of connections that are created when realized. If an exception occurs while attempting to create a connection, I'd like to iterate through all of the connections that have already been realized in the LazySeq and close them. Something like:

(try
  (dorun connections)
  (catch ConnectException _ (close-connections connections)))

This doesn't quite work though since close-connections will attempt to realize the connections again. I only want to close connections that have been realized, not realize additional connections. Any ideas for doing this?

There are 2 answers

Michał Marczyk On BEST ANSWER

Code:

This returns the previously realized initial fragment of the input seq as a vector:

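(defn take-realized [xs]
  (letfn [(lazy-seq? [xs]
            (instance? clojure.lang.LazySeq xs))]
    (loop [xs  xs
           out []]
      ;; stop at the first LazySeq tail that has not been realized yet
      (if (and (lazy-seq? xs) (not (realized? xs)))
        out
        ;; seq is safe here: xs is either already realized or not a LazySeq;
        ;; a nil result means a realized but empty tail, so no nil is added
        (if-let [s (seq xs)]
          (recur (rest s) (conj out (first s)))
          out)))))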

Testing at the REPL:

(defn lazy-printer [n]
  (lazy-seq
   (when-not (zero? n)
     (println n)
     (cons n (lazy-printer (dec n))))))

(take-realized (lazy-printer 10))
;= []

(take-realized (let [xs (lazy-printer 10)] (dorun (take 1 xs)) xs))
;=> 10
;= [10]

;; range returns a lazy seq...
(take-realized (range 20))
;= []

;; ...wrapping a chunked seq
(take-realized (seq (range 40)))
;= [0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16
;   17 18 19 20 21 22 23 24 25 26 27 28 29 30 31]

;; NB. *each* chunk of range gets its own LazySeq wrapper,
;; so that it is possible to work with infinite (or simply huge) ranges

(Using ;=> to indicate a printout.)
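Applied to the situation in the question, the idea can be sketched as follows. This is only an illustration: open-connection, close-connection, lazy-connections and the closed atom are hypothetical stand-ins for real connection handling, and take-realized is a compact version along the lines of the function above (it calls seq only on tails that are already realized).

```clojure
(import 'java.net.ConnectException)

;; Compact take-realized: (seq xs) only runs on tails that are already
;; realized (or are not LazySeqs), so it forces nothing new.
(defn take-realized [xs]
  (loop [xs xs, out []]
    (if (and (instance? clojure.lang.LazySeq xs) (not (realized? xs)))
      out
      (if-let [s (seq xs)]
        (recur (rest s) (conj out (first s)))
        out))))

;; Hypothetical stand-ins: ids of "closed" connections are recorded here.
(def closed (atom []))

(defn open-connection [id]
  (if (= id 3)
    (throw (ConnectException. (str "cannot reach host " id)))
    {:id id}))

(defn close-connection [conn]
  (swap! closed conj (:id conn)))

;; One connection is opened per realized element.
(defn lazy-connections [ids]
  (lazy-seq
   (when-let [[id & more] (seq ids)]
     (cons (open-connection id) (lazy-connections more)))))

(let [connections (lazy-connections [1 2 3 4])]
  (try
    (dorun connections)
    (catch ConnectException _
      (doseq [conn (take-realized connections)]
        (close-connection conn)))))

@closed
;= [1 2]
```

Connection 3 fails while being opened, so only connections 1 and 2 are closed, and connection 4 is never attempted.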

Discussion:

realized? is indeed the way to go, as suggested by Nathan. However, as I explained in my comments on Nathan's answer, one must also make sure not to inadvertently call seq on one's input, as that would cause previously unrealized fragments of the input seq to become realized. That means that functions such as not-empty and empty? are out, since they are implemented in terms of seq.

(In fact, it is fundamentally impossible to tell whether a lazy seq is empty without realizing it.)
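The difference between asking realized? and calling a seq-based function can be checked with a counter. A minimal sketch; counted-seq, realized-count, before and after are hypothetical names introduced for the illustration:

```clojure
;; Counts how many elements have been realized so far.
(def realized-count (atom 0))

(defn counted-seq [n]
  (lazy-seq
   (when (pos? n)
     (swap! realized-count inc)
     (cons n (counted-seq (dec n))))))

(def xs (counted-seq 5))

;; realized? asks the LazySeq itself and forces nothing.
(def before [(realized? xs) @realized-count])
;= [false 0]

;; empty? goes through seq, which realizes the head of xs.
(empty? xs)
;= false

(def after [(realized? xs) @realized-count])
;= [true 1]
```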

Also, while functions like lazify (from Nathan's answer) are useful for unchunking sequences, they do not prevent their underlying seqs from being realized in a chunked fashion; rather, they enable layers of processing (map, filter etc.) to operate in an unchunked fashion even while their original input seqs are chunked. There is in fact no connection at all between such a "lazified" / "unchunked" seq being realized and its underlying, possibly chunked seq being realized. (In fact there is no way to establish such a connection in the presence of other observers of the input seq; absent other observers, it could be accomplished, but only at the cost of making lazify considerably more tedious to write.)
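The chunking point can be observed by counting how often a mapped function runs. A minimal sketch: lazify is copied from Nathan's answer, while calls, source and unchunked are hypothetical names. The exact chunk size is an implementation detail, so the example only relies on more than one element being computed.

```clojure
;; lazify as defined in Nathan's answer.
(defn lazify [s]
  (if (empty? s)
    nil
    (lazy-seq (cons (first s) (lazify (rest s))))))

;; Counts how many times the mapped function has been called.
(def calls (atom 0))

;; map over a chunked source computes a whole chunk at a time.
(def source (map (fn [i] (swap! calls inc) i) (range 100)))

(def unchunked (lazify source))

(first unchunked)
;= 0

;; More than one call has happened: asking for a single element of the
;; unchunked wrapper still realized a whole chunk of the underlying seq.
(> @calls 1)
;= true
```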

Nathan Davis On

Update: While this answer works in the context presented in the original question (running doall over a sequence and determining which elements were realized if an exception occurred), it contains several flaws and is unsuitable for the general use suggested by the question title. It does, however, present a theoretical (but flawed) basis that might help in understanding Michał Marczyk's answer. If you are having trouble understanding that answer, this answer might help by breaking things down a little more. It also illustrates several pitfalls you might encounter. But otherwise, just ignore this answer.

LazySeq implements IPending, so theoretically this should be as easy as iterating over successive tail sequences until realized? returns false:

(defn successive-tails [s]
  (take-while not-empty
              (iterate rest s)))

(defn take-realized [s]
  (map first
       (take-while realized?
                   (successive-tails s))))

Now, if you truly have a 100% LazySeq from start to finish, that's it -- take-realized will return the items of s that have been realized.

Edit: Ok, not really. This will work for determining which items were realized before an exception was thrown. However, as Michał Marczyk points out, it will cause every item in the sequence to be realized in other contexts.
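That pitfall is easy to reproduce. In the sketch below, successive-tails and take-realized are the definitions from this answer; counted-seq, realized-count, fresh and result are hypothetical names added for the illustration. Because not-empty calls seq on every tail, a sequence that had no realized elements comes back fully realized:

```clojure
(defn successive-tails [s]
  (take-while not-empty
              (iterate rest s)))

(defn take-realized [s]
  (map first
       (take-while realized?
                   (successive-tails s))))

;; Counts how many elements have been realized so far.
(def realized-count (atom 0))

(defn counted-seq [n]
  (lazy-seq
   (when (pos? n)
     (swap! realized-count inc)
     (cons n (counted-seq (dec n))))))

;; Nothing has been realized at this point.
(def fresh (counted-seq 5))

(def result (doall (take-realized fresh)))

result
;= (5 4 3 2 1)

@realized-count
;= 5
```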

You can then write your cleanup logic like this:

(try
  (dorun connections) ; or doall
  (catch ConnectException _ (close-connections (take-realized connections))))

However, be aware that a lot of Clojure's "lazy" constructs are not 100% lazy. For example, range will return a LazySeq, but if you start resting down it, it turns into a ChunkedCons. Unfortunately, ChunkedCons does not implement IPending, and calling realized? on one will throw an exception. To work around this, we can use lazy-seq to explicitly build a LazySeq that will stay a LazySeq for any sequence:

(defn lazify [s]
  (if (empty? s)
    nil
    (lazy-seq (cons (first s) (lazify (rest s))))))

Edit: As Michał Marczyk pointed out in a comment, lazify does not guarantee the underlying sequence is lazily consumed. In fact, it will probably realize previously unrealized items (but appears to only throw an exception the first time through). Its sole purpose is to guarantee that calling rest results in either nil or a LazySeq. In other words, it works well enough to run the example below, but YMMV.
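The IPending pitfall mentioned above can be seen directly. A minimal sketch: chunked-tail, outcome and realized-or-concrete? are hypothetical names, and the guard simply assumes that anything that is not IPending can be treated as already concrete.

```clojure
;; Calling seq on a lazy seq built over a chunked source typically yields
;; a ChunkedCons, which does not implement IPending.
(def chunked-tail (seq (map inc (range 100))))

;; realized? casts its argument to IPending, so this throws.
(def outcome
  (try
    (realized? chunked-tail)
    (catch ClassCastException _ :not-ipending)))
;= :not-ipending

;; Hypothetical guard: only ask realized? of things that are IPending.
(defn realized-or-concrete? [x]
  (or (not (instance? clojure.lang.IPending x))
      (realized? x)))

(realized-or-concrete? chunked-tail)
;= true

(realized-or-concrete? (lazy-seq [1 2 3]))
;= false
```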

Now if we use the same "lazified" sequence in both the dorun and the cleanup code, we will be able to use take-realized. Here's an example that illustrates how to build an expression that will return a partial sequence (the part before the failure) if an exception occurs while realizing it:

(let [v (for [i (lazify (range 100))]
          (if (= i 10)
            (throw (new RuntimeException "Boo!"))
            i))]
  (try
    (doall v)
    (catch Exception _ (take-realized v))))