collections.Counter is extremely useful for gathering counts and operating on them as objects. You can do things like:
>>> Counter({'a': 2, 'b': 5}) + Counter({'a': 3, 'c': 7})
Counter({'c': 7, 'a': 5, 'b': 5})
This essentially groups items by their keys and sums the values of each group.
What is a minimum-code way to reuse the Counter functionality with non-integer values?
These values would already have addition defined as the "reducing" operation I want for each group: for example, strings and lists (both of which define __add__ as concatenation).
As is, we get:
>>> Counter({'a': 'hello ', 'b': 'B'}) + Counter({'a': 'world', 'c': 'C'})
Traceback (most recent call last):
...
TypeError: '>' not supported between instances of 'str' and 'int'
>>> Counter({'a': [1, 2], 'b': [3, 4]}) + Counter({'a': [1, 1], 'c': [5, 6, 7]})
Traceback (most recent call last):
...
TypeError: '>' not supported between instances of 'list' and 'int'
The code of collections.Counter hard-codes the assumption that values are integers, so things like self.get(k, 0) and count > 0 are littered all over the place. It therefore seems that subclassing Counter wouldn't be much less work than writing my own specialized (or general) custom class (probably using collections.defaultdict).
Instead, it seems like wrapping the values (e.g. str and list) to be able to operate with 0 as if it were an empty element might be the elegant approach.
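To make the wrapping idea concrete, here is a minimal sketch for str. The class name ZeroAwareStr is hypothetical, and the sketch assumes that Counter's addition only ever adds the integer 0 to a value and compares the result against 0. That is how the current CPython implementation appears to behave, but it is an implementation detail, not a documented guarantee.

```python
from collections import Counter


class ZeroAwareStr(str):
    """Hypothetical str wrapper that treats the integer 0 as an empty string."""

    def __add__(self, other):
        # Counter adds 0 when the other operand lacks the key.
        if isinstance(other, int) and other == 0:
            return self
        # Otherwise fall back to ordinary concatenation, keeping the wrapper type.
        return ZeroAwareStr(str.__add__(self, other))

    def __radd__(self, other):
        # Supports 0 + value, the mirrored case of __add__.
        if isinstance(other, int) and other == 0:
            return self
        return NotImplemented

    def __gt__(self, other):
        # Counter keeps an entry only if value > 0; treat non-empty as positive.
        if isinstance(other, int) and other == 0:
            return len(self) > 0
        return str.__gt__(self, other)


left = Counter({'a': ZeroAwareStr('hello '), 'b': ZeroAwareStr('B')})
right = Counter({'a': ZeroAwareStr('world'), 'c': ZeroAwareStr('C')})
result = left + right
```

Under those assumptions, dict(result) compares equal to {'a': 'hello world', 'b': 'B', 'c': 'C'}. Every value has to be wrapped by hand, and a list version would need its own wrapper class.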
I wouldn't call "defining addition to integers for collection types" more elegant than just rewriting Counter to do the work you need. You need behaviors that have nothing to do with counting, so you need a class that isn't focused on counting things.
Fundamentally, Counter is inappropriate to your use case; you don't have counts. What does elements mean when you lack a count to multiply each key by? most_common might work as written, but it would have nothing to do with frequencies.
In 95% of cases, I'd just use collections.defaultdict(list) (or whatever default is appropriate), and in the other 5%, I'd use Counter as a model and implement my own version (without count-specific behaviors).
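As a rough illustration of both options, the sketch below shows a defaultdict-based merge and a small Counter-like dict subclass. The names merge_by_key and AddingDict are hypothetical and come from neither the question nor the standard library. The only assumption about the values is that they support +.

```python
from collections import defaultdict


def merge_by_key(*mappings, factory=list):
    """Hypothetical helper: group values by key and combine them with +."""
    merged = defaultdict(factory)  # factory() supplies the "empty" value
    for mapping in mappings:
        for key, value in mapping.items():
            # Build a new object instead of mutating the input values.
            merged[key] = merged[key] + value
    return dict(merged)


class AddingDict(dict):
    """Hypothetical Counter-like dict: + combines values key by key."""

    def __add__(self, other):
        if not isinstance(other, dict):
            return NotImplemented
        result = type(self)(self)  # shallow copy of the left operand
        for key, value in other.items():
            # Keys present on both sides are combined; new keys are copied over.
            result[key] = result[key] + value if key in result else value
        return result


lists = merge_by_key({'a': [1, 2], 'b': [3, 4]}, {'a': [1, 1], 'c': [5, 6, 7]})
strings = merge_by_key({'a': 'hello ', 'b': 'B'}, {'a': 'world', 'c': 'C'}, factory=str)
combined = AddingDict({'a': 'hello ', 'b': 'B'}) + AddingDict({'a': 'world', 'c': 'C'})
```

Here lists is {'a': [1, 2, 1, 1], 'b': [3, 4], 'c': [5, 6, 7]}, and strings and combined both hold {'a': 'hello world', 'b': 'B', 'c': 'C'}. Neither version needs a zero element: merge_by_key gets its empty value from factory, and AddingDict only adds when both sides have the key.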