I type this code into the interpreter and memory is rapidly consumed:
last [1..10^7] `seq` ()
I can't see why this needs more than O(1) space. If i do just (which should be the same, because Show forces weak head normal form, so seq is redundant?):
last [1..10^7]
...it works fine.
I'm unable to reproduce this situation outside the interpreter.
What's going on here?
Here are some test cases: http://hpaste.org/69234
Things to note:
- By running in the interpreter, I load wtf.hs without compiling it, and type
wtf<n>
in ghci. - By compiling, I do
ghc --make wtf.hs && ./wtf
. last
can be substituted for asum
with strict accumulator or a function that finds the max element in the list, and the space leak still happens- I haven't seen this behaviour when using
$!
instead ofseq
. - I tried adding a dummy
()
parameter because I thought maybe this is a CAF problem. Changes nothing. - It's probably not a problem with functions on
Enum
, because I can reproduce the behaviour withwtf5
and later, which don't useEnum
at all. - It's probably not a problem with
Num
,Int
, orInteger
, because I can reproduce the behaviour without them inwtf14
andwtf16
.
I tried reproducing the problem with Peano arithmetic to take lists and integers out of the equation (fetching the at the end of of 10^9), but ran into other sharing/space leak problems I don't understand when trying to implement *
.
We need to look at the instance of
enumFromTo
for Integer, and last:It is defined in GHC.Enum as:
where
and
last
is fully lazy, as expected.For the Integer Enum class, we have a strict accumulator (explicitly) for
enumFrom
. In the bounded case (e.g.[1..n]
), it callsenumDeltaToInteger
and then intoup_list
, which uses a worker to unfold a list until its limit is reached.But
up_list
is strict inx
in thego
helper (see the comparison againstlim
).When run in GHCi none of this is optimized, yielding naive calls to enumFromTo, before returning
()
.So, why are we retaining the list in the
seq
case, and not in the regular case? The regular case runs nicely in constrant space, relying on the laziness ofenumFromTo
forInteger
and forlast
. The GHCi core for that case looks like:So these are almost identical, with the differences being:
seq
version,last (enumFromTo ..)
is forced inside acase
.let
.seq
version, the value is computed then discarded, yielding a()
-- nothing looks at the resultWhat is odd is that there's nothing magic about:
that makes it retain values.
As we see, the implementation of
up_list
is strict in its accumulator (since it compares againstlim
, and the list is unfolded lazily -- solast
should be able to consume it in constant space). Writing the expression by hand confirms this.Doing a heap profile of the ghci execution shows the entire list being retained:
which tells us at least that it isn't a chain of thunks, but rather, the entire list is being built strictly and held on to, until being discarded.
So the mystery is: what is holding onto the list argument to
last
in ghci, and not in ghc?I suspect some internal (or subtle) detail of ghci now -- I think this is worth a ghci ticket.