Performance of "imperative" algorithms in Haskell


I have some training in the Lisp family of languages and I am now learning a bit of Haskell for my own good. In Lisp, functional style is fine, but there are a few cases where imperative style is necessary to get decent performance, e.g.

  • append

    Append is slow since it must copy its first argument (sometimes 100× slower than versions that manage to avoid the copy). A remedy is to redirect the last cdr pointer of the first list to the beginning of the second list instead of copying. Of course this is a destructive operation.

  • sort

    Functional versions of quicksort create many intermediate lists, which somewhat defeats the purpose of the algorithm: being as quick as possible. AFAIR, in Common Lisp, sort is the only destructive function that does not have a functional counterpart.

  • length

    This is a costly operation in Lisp, since one must walk the whole list to get its length. This need not be so; AFAIK Clojure computes the length of a list in logarithmic time. The solution is often to compute the length on the fly in an imperative loop.

My question is: how do we deal with these problems in Haskell?


There are 2 answers

Thomas M. DuBuisson (BEST ANSWER)

As delnan said, these problems stem from using the wrong data structure, such as a linked list when you want a vector.

Your particular operations

  • append: cons is O(1), so I suspect you are referring to the act of allocating a cons cell, pattern matching on the cell, and eventual GC of the cell. This is rather unavoidable unless you optimize away the list, which does happen in certain situations.

  • sort: I suggest you look at the ST monad, which allows you to use algorithms that require mutation inside a pure context. See the vsort implementation for an example.

  • length: Sure, the only way to get the length of a linked list is to traverse the list. If you need the length often then consider using a Set, Map, or Vector - all of which record a total size that can be accessed in O(1).
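
To make the append and length points concrete, here is a small sketch of my own (not from the answer) using `Data.Sequence` from the containers package: `(><)` appends two sequences in O(log(min(m, n))) without any mutation, and `Seq.length` is O(1) because the size is cached in the tree nodes.

```haskell
import qualified Data.Sequence as Seq
import Data.Sequence ((><))
import Data.Foldable (toList)

main :: IO ()
main = do
  let xs = Seq.fromList [1 .. 5 :: Int]
      ys = Seq.fromList [6 .. 10]
      zs = xs >< ys          -- O(log(min(m, n))) append, purely functional
  print (toList zs)          -- convert back to a list for display
  print (Seq.length zs)      -- O(1): the size is stored in the structure
```

Compare this with `(++)` on plain lists, which must rebuild the entire spine of its first argument.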

In General

  • Look into ST for mutability.
  • Use the right data structure for the right problem. Learn what structures Hackage has to offer (vectors, arrays, hashmaps, hashtables, bloomfilters, and more).
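
As an illustration of the ST point, here is a minimal sketch (my own, with hypothetical names `mkArray` and `sortST`) of a pure function that mutates an unboxed array in place internally. Bubble sort keeps it short; real code would use something like `Data.Vector.Algorithms.Intro.sort` from vector-algorithms.

```haskell
import Control.Monad (forM_, when)
import Control.Monad.ST (ST, runST)
import Data.Array.ST (STUArray, getElems, newListArray, readArray, writeArray)

-- Helper that pins down the array type so no inline annotation is needed.
mkArray :: Int -> [Int] -> ST s (STUArray s Int Int)
mkArray n = newListArray (0, n - 1)

-- A pure function whose implementation mutates in place under runST.
sortST :: [Int] -> [Int]
sortST xs = runST $ do
    arr <- mkArray n xs
    forM_ [0 .. n - 2] $ \i ->            -- classic bubble-sort passes
      forM_ [0 .. n - 2 - i] $ \j -> do
        a <- readArray arr j
        b <- readArray arr (j + 1)
        when (a > b) $ do                 -- swap out-of-order neighbours
          writeArray arr j b
          writeArray arr (j + 1) a
    getElems arr
  where
    n = length xs

main :: IO ()
main = print (sortST [3, 1, 4, 1, 5, 9, 2, 6])
```

The `runST` rank-2 type guarantees the mutation cannot leak, so `sortST` is referentially transparent despite the destructive updates inside.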
J. Abrahamson

append isn't a universal thing. Appending onto a double-ended queue is different from appending onto a cons list. Appending destructively is different from copy-on-append. Depending on your criteria, you may optimize for speed, thread safety, or simplicity, and your definition of "problem" will change.

As delnan and Thomas DuBuisson mention, Haskell has all of these options if you pick the right data type.

So if you have a specific algorithm you'd like to optimize, there's likely to be either a way to translate it to a fast non-destructive version, an option to simulate destructive updates at (usually) a multiplicative O(log n) slowdown, or an option to run referentially transparent destructive updates at an additive constant slowdown.
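
The "multiplicative log(n) slowdown" option can be sketched like this (my own example, with hypothetical names `PArray`, `fromListP`, `writeP`): an `IntMap` stands in for a mutable array, each read or "write" costs logarithmic hops instead of O(1), and every old version remains valid after an update.

```haskell
import qualified Data.IntMap.Strict as IntMap
import Data.IntMap.Strict (IntMap)

-- Persistent stand-in for a mutable array of elements of type a.
type PArray a = IntMap a

fromListP :: [a] -> PArray a
fromListP = IntMap.fromList . zip [0 ..]

-- "Destructive" update, simulated: returns a new version, old one survives.
writeP :: Int -> a -> PArray a -> PArray a
writeP = IntMap.insert

main :: IO ()
main = do
  let v0 = fromListP "haskell"
      v1 = writeP 0 'H' v0
  print (IntMap.lookup 0 v0)   -- the old version is unchanged
  print (IntMap.lookup 0 v1)   -- the new version sees the update
```

This is exactly the trade-off persistent structures buy you: no mutation, full sharing of history, at a logarithmic cost per operation.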

For good examples of the trouble with this translation, look at algorithms for persistent union-find or the Functional Graph Library.