collections.Counter is extremely useful for gathering counts and operating on them as objects. You can do things like:
>>> Counter({'a': 2, 'b': 5}) + Counter({'a': 3, 'c': 7})
Counter({'c': 7, 'a': 5, 'b': 5})
This essentially groups items by their keys and sums the values of each group.
What is a minimum-code way to reuse the Counter functionality with non-integer values?
These values would already have addition defined as the "reducing" operation I want for each group: for example, strings and lists (which both define __add__ as concatenation).
As is, we get:
>>> Counter({'a': 'hello ', 'b': 'B'}) + Counter({'a': 'world', 'c': 'C'})
Traceback (most recent call last):
...
TypeError: '>' not supported between instances of 'str' and 'int'
>>> Counter({'a': [1, 2], 'b': [3, 4]}) + Counter({'a': [1, 1], 'c': [5, 6, 7]})
Traceback (most recent call last):
...
TypeError: '>' not supported between instances of 'list' and 'int'
In the code of collections.Counter, there is a hard-coded assumption that values are integers, and therefore things like self.get(k, 0) and count > 0 are littered all over the place. It therefore seems that subclassing Counter wouldn't be much less work than writing my own specialized (or general) custom class (probably using collections.defaultdict).
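To make the integer assumption concrete, here is a simplified paraphrase of the addition logic (not the actual CPython source; the function name counter_style_add is made up for illustration). It uses the same two ingredients mentioned above: a default of 0 for missing keys and a count > 0 filter.

```python
def counter_style_add(left, right):
    """Simplified paraphrase of Counter-style addition over two plain dicts."""
    result = {}
    for key, count in left.items():
        # A key missing on the right contributes the integer 0.
        new_count = count + right.get(key, 0)
        # Only strictly positive totals are kept.
        if new_count > 0:
            result[key] = new_count
    for key, count in right.items():
        if key not in left and count > 0:
            result[key] = count
    return result


# Works for integers, exactly like the first example.
print(counter_style_add({'a': 2, 'b': 5}, {'a': 3, 'c': 7}))

# Fails for strings: 'hello ' + 'world' succeeds, but 'hello world' > 0 raises.
try:
    counter_style_add({'a': 'hello ', 'b': 'B'}, {'a': 'world', 'c': 'C'})
except TypeError as exc:
    print(exc)
```

Both the 0 default and the comparison against 0 would have to be replaced for non-integer values, which is why a plain subclass does not save much work.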
Instead, it seems that wrapping the values (e.g. str and list) so that they can operate with 0 as if it were an empty element might be the elegant approach.
I'd propose two solutions:
One wraps the values themselves, though only the examples in the question are assured to be covered; other Counter operations are not.
The other writes a "Counter-like" custom class, subclassing defaultdict, and again only implements the __add__ method.
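A minimal sketch of both ideas, under stated assumptions: the names ZeroFriendly and AddDict are hypothetical, and the wrapper relies on Counter addition using only value + 0 and value > 0, which is an implementation detail rather than a documented guarantee. Only + is handled in either case.

```python
from collections import Counter, defaultdict


class ZeroFriendly:
    """Hypothetical wrapper letting a str or list pass through Counter's `+`."""

    def __init__(self, value):
        self.value = value

    def __add__(self, other):
        # Counter supplies the integer 0 for missing keys; treat it as "empty".
        if isinstance(other, int) and other == 0:
            return self
        return ZeroFriendly(self.value + other.value)

    def __radd__(self, other):
        # Handles 0 + wrapper; any other left operand is unsupported.
        if isinstance(other, int) and other == 0:
            return self
        return NotImplemented

    def __gt__(self, other):
        # Counter keeps an entry only if value > 0; non-empty counts as positive,
        # so empty strings or lists are dropped from the result.
        if isinstance(other, int) and other == 0:
            return len(self.value) > 0
        return NotImplemented


class AddDict(defaultdict):
    """Hypothetical Counter-like mapping that implements only `+`."""

    def __add__(self, other):
        result = AddDict(self.default_factory)
        for key, value in self.items():
            # Check membership first so `other` is never mutated by __missing__.
            result[key] = (value + other[key]) if key in other else value
        for key, value in other.items():
            if key not in self:
                result[key] = value
        return result


# Solution 1: wrap the values, then unwrap after adding.
total = (Counter({'a': ZeroFriendly('hello '), 'b': ZeroFriendly('B')})
         + Counter({'a': ZeroFriendly('world'), 'c': ZeroFriendly('C')}))
print({key: wrapped.value for key, wrapped in total.items()})

# Solution 2: the defaultdict subclass, no wrapping needed.
merged = (AddDict(list, {'a': [1, 2], 'b': [3, 4]})
          + AddDict(list, {'a': [1, 1], 'c': [5, 6, 7]}))
print(dict(merged))
```

Note that in AddDict, values for keys present on only one side are shared with the operand rather than copied, so mutating a list in the result also mutates the original.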