Scala Stream.grouped buffers entire stream into memory


Calling grouped on a Scala Stream seems to buffer the entire stream into memory. I've dug into this quite a bit to determine which class is holding a reference to the Stream's head.

A simple example:

lazy val stream1: Stream[Int] = {
  def loop(v: Int): Stream[Int] = v #:: loop(v + 1)
  loop(0)
}

stream1.take(1000).grouped(10).foreach(println)

If one runs this code and places a breakpoint within the foreach function, one can see that there is a reference being held to the Stream's head as it's drawn out.

After several iterations, there are still references to earlier "chunks" of the Stream in memory:

[Screenshot: Stream Cons cells in memory after a few iterations]

Additionally, if we inspect the reference to the head of the Stream, we can see that some lambda within IterableLike is holding a reference.

[Screenshot of the reference held on the Stream's head]

When grouped is called on the Stream, the Collections library first calls iterator on the Stream, returning a StreamIterator and then grouped on that iterator, returning a GroupedIterator. The screenshots above suggest that something within GroupedIterator seems to be holding onto the head of the Stream, but I cannot determine what.
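The call chain described above can be sketched as follows. This is only an illustration of the two routes, not the library's source: the object name GroupedDecomposition and the helper naturals are made up for this example, and it assumes a Scala version where Stream is still available (2.12, or 2.13 with deprecation warnings).

```scala
object GroupedDecomposition {
  // Hypothetical helper: an infinite Stream of Ints, built the same way as in the question.
  def naturals(from: Int): Stream[Int] = from #:: naturals(from + 1)

  // The route from the question: grouped is called directly on the Stream.
  def viaStream: List[List[Int]] =
    naturals(0).take(30).grouped(10).map(_.toList).toList

  // The steps spelled out by hand: Stream -> Iterator -> GroupedIterator.
  def viaIterator: List[List[Int]] =
    naturals(0).take(30).iterator.grouped(10).map(_.toList).toList
}
```

Both routes produce the same groups, so swapping one for the other under the debugger is one way to check whether the retained reference comes from the Stream-side grouped or from the iterators themselves.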

My question is twofold:
1. Is this expected behavior with Scala Streams?
2. If not, what is happening within the implementation of StreamIterator and GroupedIterator that causes the head of a Stream to be held onto while running .grouped(N) on a Stream?
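To narrow this down, it may help to rule out the lazy val itself: any val that refers to a Stream keeps every memoized cell reachable for as long as the val is reachable, independently of grouped. Below is a minimal sketch for comparing the cases. The object HeadRetentionCheck and its method names are invented for this example, and it makes no claim about which variant actually frees memory.

```scala
object HeadRetentionCheck {
  // def, not val: each call builds a fresh Stream and this object stores no reference to it.
  def naturals(from: Int): Stream[Int] = from #:: naturals(from + 1)

  // Groups drawn from a Stream, as in the question.
  def groupsFromStream(n: Int, size: Int): Iterator[Stream[Int]] =
    naturals(0).take(n).grouped(size)

  // Groups drawn from a plain Iterator, which does not memoize its elements.
  def groupsFromIterator(n: Int, size: Int): Iterator[Seq[Int]] =
    Iterator.from(0).take(n).grouped(size)
}
```

Running something like HeadRetentionCheck.groupsFromStream(1000, 10).foreach(println) with the same breakpoint as before, and then the iterator variant, should show whether the earlier Cons cells are still reachable once the lazy val is out of the picture.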
