I'm having difficulties in understanding Haskell lazy evaluation.
I wrote simple test program. It reads 4 lines of data and second and fourth input line has lots of numbers.
consumeList :: [Int] -> [Int] -> [Int]
consumeList [] _ = error "hi" -- to generate heap debug
consumeList (x:xs) y = consumeList xs y
main = do
inputdata <- getContents
let (x:y:z:k:xs) = lines inputdata
s = map (read ::String->Int) $ words $ k
t = []
print $ consumeList s t
words
and map
is performed
on streams of characters lazily, this program use constant memory.
But As I add arguments t
, situation is changed.
My expectation is since t
is map
and words
on lazy stream,
and t
is not used in consumeList
, this change should not alter
memory usage. But no.
consumeList :: [Int] -> [Int] -> [Int]
consumeList [] _ = error "hi" -- to generate heap debug
consumeList (x:xs) y = consumeList xs y
main = do
inputdata <- getContents
let (x:y:z:k:xs) = lines inputdata
s = map (read ::String->Int) $ words $ k
t = map (read ::String->Int) $ words $ y
print $ consumeList s t -- <-- t is not used
Q1) Why does this program keeps allocating memory when t
is not used at all?
I have another question. When I pattern match lazy stream with [,]
, not
with (:)
memory allocation behaviour is changed.
consumeList :: [Int] -> [Int] -> [Int]
consumeList [] _ = error "hi" -- to generate heap debug
consumeList (x:xs) y = consumeList xs y
main = do
inputdata <- getContents
let [x,y,z,k] = lines inputdata -- <---- changed from (x:y:..)
s = map (read ::String->Int) $ words $ k
t = []
print $ consumeList s t
Q2) are (:)
and [,]
different in terms of lazy evalutation?
Any comments are welcomed. Thanks
[EDIT]
Q3) Then, is it possible to process 4th line first and the process 2nd line, without increasing memory consumption?
The experiment guided by Derek is as follows. With switching y and k from second example, I got same result:
consumeList :: [Int] -> [Int] -> [Int]
consumeList [] _ = error "hi"
consumeList (x:xs) y = consumeList xs y
main = do
inputdata <- getContents
let (x:y:z:k:xs) = lines inputdata
s = map (read ::String->Int) $ words $ y -- <- swap with k
t = map (read ::String->Int) $ words $ k -- <- swap with y
print $ consumeList s t
To answer your first question,
t
is live as far as the garbage collector is concerned until you reach the end of theconsumeList
. That wouldn't be such a big deal sincet
would be a thunk pointing at work to do, but the problem here is that thunk is now keepingy
alive andgetContents
has to actually read iny
to get tok
. In your first example,y
could be garbage collected as it was being read in. (As an experiment, if you switchedy
andk
in this example, I suspect you'd see behavior very much like your first example.)For your second question,
let [x,y,z,k] = ...
means "(irrefutably) match a list of exactly four elements". That means when you forcek
it needs to (at that point) check that there are no further elements, which means it needs to read in all of the input corresponding tok
before it can start processing it. In the earlier case,let (x:y:z:k:xs) = ...
it can start processingk
immediately because it doesn't have to first check whetherxs
is[]
or(_:_)
.