How to count the occurences of every element in a list in Haskell fast?

96 views Asked by At

I have list of strings that represent classes of classified objects.

["Class 1", "Class 2", "Class 1", "Class 2", "Class 3"] 

would yield [2,2,1] (their order does not matter).

I have tried two functions that do this, both of them are too slow for large lists (thousands of items).

This is the first one that does slighly better than the second one:

uniqueClassesCounts :: [String] -> [Int]
uniqueClassesCounts classNames=
  let uniqueClasses = nub classNames
  in [length (filter (== cls) classNames) | cls <- uniqueClasses]

And the slower one:

uniqueClassesCounts :: [String] -> [Int]
uniqueClassesCounts classNames= map length (group (sort (classNames)))

I am pretty sure there must be a way to do this in linear time (my brief researched showed that the above functions both work in quadratic time and worse).

How can I make this thing faster? It is the unexpected bottleneck of my code, more than 70% of time is spent on it.

1

There are 1 answers

2
willeM_ Van Onsem On

It is probably better to use a HashMap or some other collection that can make lookups efficient.

You could do this with fromListWith :: (Eq k, Hashable k) => (v -> v -> v) -> [(k, v)] -> HashMap k v:

{-# LANGUAGE TupleSections #-}

import Data.HashMap.Strict(fromListWith)

uniqueClassesCounts :: (Eq a, Hashable a) => [a] -> HashMap a Int
uniqueClassesCounts = fromListWith (+) . map (,1)

If you are only interested in the counts, we can use elems :: HashMap k v -> [v]:

{-# LANGUAGE TupleSections #-}

import Data.HashMap.Strict(elems, fromListWith)

uniqueClassesCounts :: (Eq a, Hashable a) => [a] -> [Int]
uniqueClassesCounts = elems . fromListWith (+) . map (,1)