Subclassing builtin types in Python 2 and Python 3

1.2k views Asked by At

When subclassing builtin types, I noticed a rather important difference between Python 2 and Python 3 in the return type of the methods of the built-in types. The following code illustrates this for sets:

class MySet(set):

    pass

s1 = MySet([1, 2, 3, 4, 5])

s2 = MySet([1, 2, 3, 6, 7])

print(type(s1.union(s2)))

print(type(s1.intersection(s2)))

print(type(s1.difference(s2)))

With Python 2, all the return values are of type MySet. With Python 3, the return types are set. I could not find any documentation on what the result is supposed to be, nor any documentation about the change in Python 3.

Anyway, what I really care about is this: is there a simple way in Python 3 to get the behavior seen in Python 2, without redefining every single method of the built-in types?

3

There are 3 answers

2
Sven Marnach On

This isn't a general change for built-in types when moving from Python 2.x to 3.x -- list and int, for example, have the same behaviour in 2.x and 3.x. Only the set type was changed to bring it in line with the other types, as discussed in this bug tracker issue.

I'm afraid there is no really nice way to make it behave the old way. Here is some code I was able to come up with:

class MySet(set):
    def copy(self):
        return MySet(self)
    def _make_binary_op(in_place_method):
        def bin_op(self, other):
            new = self.copy()
            in_place_method(new, other)
            return new
        return bin_op
    __rand__ = __and__ = _make_binary_op(set.__iand__)
    intersection = _make_binary_op(set.intersection_update)
    __ror__ = __or__ = _make_binary_op(set.__ior__)
    union = _make_binary_op(set.update)
    __sub__ = _make_binary_op(set.__isub__)
    difference = _make_binary_op(set.difference_update)
    __rxor__ = xor__ = _make_binary_op(set.__ixor__)
    symmetric_difference = _make_binary_op(set.symmetric_difference_update)
    del _make_binary_op
    def __rsub__(self, other):
        new = MySet(other)
        new -= self
        return new

This will simply overwrite all methods with versions that return your own type. (There is a whole lot of methods!)

Maybe for your application, you can get away with overwriting copy() and stick to the in-place methods.

4
Ethan Furman On

Perhaps a metaclass to do all that humdrum wrapping for you would make it easier:

class Perpetuate(type):
    def __new__(metacls, cls_name, cls_bases, cls_dict):
        if len(cls_bases) > 1:
            raise TypeError("multiple bases not allowed")
        result_class = type.__new__(metacls, cls_name, cls_bases, cls_dict)
        base_class = cls_bases[0]
        known_attr = set()
        for attr in cls_dict.keys():
            known_attr.add(attr)
        for attr in base_class.__dict__.keys():
            if attr in ('__new__'):
                continue
            code = getattr(base_class, attr)
            if callable(code) and attr not in known_attr:
                setattr(result_class, attr, metacls._wrap(base_class, code))
            elif attr not in known_attr:
                setattr(result_class, attr, code)
        return result_class
    @staticmethod
    def _wrap(base, code):
        def wrapper(*args, **kwargs):
            if args:
                cls = args[0]
            result = code(*args, **kwargs)
            if type(result) == base:
                return cls.__class__(result)
            elif isinstance(result, (tuple, list, set)):
                new_result = []
                for partial in result:
                    if type(partial) == base:
                        new_result.append(cls.__class__(partial))
                    else:
                        new_result.append(partial)
                result = result.__class__(new_result)
            elif isinstance(result, dict):
                for key in result:
                    value = result[key]
                    if type(value) == base:
                        result[key] = cls.__class__(value)
            return result
        wrapper.__name__ = code.__name__
        wrapper.__doc__ = code.__doc__
        return wrapper

class MySet(set, metaclass=Perpetuate):
    pass

s1 = MySet([1, 2, 3, 4, 5])

s2 = MySet([1, 2, 3, 6, 7])

print(s1.union(s2))
print(type(s1.union(s2)))

print(s1.intersection(s2))
print(type(s1.intersection(s2)))

print(s1.difference(s2))
print(type(s1.difference(s2)))
0
khinsen On

As a follow-up to Sven's answer, here is a universal wrapping solution that takes care of all non-special methods. The idea is to catch the first lookup coming from a method call, and install a wrapper method that does the type conversion. At subsequent lookups, the wrapper is returned directly.

Caveats:

1) This is more magic trickery than I like to have in my code.

2) I'd still need to wrap special methods (__and__ etc.) manually because their lookup bypasses __getattribute__

import types

class MySet(set):

    def __getattribute__(self, name):
        attr = super(MySet, self).__getattribute__(name)
        if isinstance(attr, types.BuiltinMethodType):
            def wrapper(self, *args, **kwargs):
                result = attr(self, *args, **kwargs)
                if isinstance(result, set):
                    return MySet(result)
                else:
                    return result
            setattr(MySet, name, wrapper)
            return wrapper
        return attr