I'm picking apart some C++ Python wrapper code that allows the consumer to construct custom old style and new style Python classes from C++.
The original code comes from PyCXX, with old and new style classes here and here. I have however rewritten the code substantially, and in this question I will reference my own code, as it allows me to present the situation in the greatest clarity that I am able. I think there would be very few individuals capable of understanding the original code without several days of scrutiny... For me it has taken weeks and I'm still not clear on it.
The old style simply derives from PyObject,
template<typename FinalClass>
class ExtObj_old : public ExtObjBase<FinalClass>
   // ^ which : ExtObjBase_noTemplate : PyObject    
{
public:
    // forwarding function to mitigate awkwardness retrieving static method 
    // from base type that is incomplete due to templating
    static TypeObject& typeobject() { return ExtObjBase<FinalClass>::typeobject(); }
    static void one_time_setup()
    {
        typeobject().set_tp_dealloc( [](PyObject* t) { delete (FinalClass*)(t); } );
        typeobject().supportGetattr(); // every object must support getattr
        FinalClass::setup();
        typeobject().readyType();
    }
    // every object needs getattr implemented to support methods
    Object getattr( const char* name ) override { return getattr_methods(name); }
    // ^ MARKER1
protected:
    explicit ExtObj_old()
    {
        PyObject_Init( this, typeobject().type_object() ); // MARKER2
    }
When one_time_setup() is called, it forces (by accessing base class typeobject()) creation of the associated PyTypeObject for this new type.
Later when an instance is constructed, it uses PyObject_Init
So far so good.
But the new style class uses much more complicated machinery. I suspect this is related to the fact that new style classes allow derivation.
And this is my question, why is the new style class handling implemented in the way that it is? Why is it having to create this extra PythonClassInstance structure? Why can't it do things the same way the old-style class handling does? i.e. Just type convert from the PyObject base type? And seeing as it doesn't do that, does this mean it is making no use of its PyObject base type?
This is a huge question, and I will keep amending the post until I'm satisfied it represents the issue well. It isn't a good fit for SO's format, I'm sorry about that. However, some world-class engineers frequent this site (one of my previous questions was answered by the lead developer of GCC for example), and I value the opportunity to appeal to their expertise. So please don't be too hasty to vote to close.
The new style class's one-time setup looks like this:
template<typename FinalClass>
class ExtObj_new : public ExtObjBase<FinalClass>
{
private:
    PythonClassInstance* m_class_instance;
public:
    static void one_time_setup()
    {
        TypeObject& typeobject{ ExtObjBase<FinalClass>::typeobject() };
        // these three functions are listed below
        typeobject.set_tp_new(      extension_object_new );
        typeobject.set_tp_init(     extension_object_init );
        typeobject.set_tp_dealloc(  extension_object_deallocator );
        // this should be named supportInheritance, or supportUseAsBaseType
        // old style class does not allow this
        typeobject.supportClass(); // does: table->tp_flags |= Py_TPFLAGS_BASETYPE
        typeobject.supportGetattro(); // always support get and set attr
        typeobject.supportSetattro();
        FinalClass::setup();
        // add our methods to the extension type's method table
        { ... typeobject.set_methods( /* ... */); }
        typeobject.readyType();
    }
protected:
    explicit ExtObj_new( PythonClassInstance* self, Object& args, Object& kwds )
      : m_class_instance{self}
    { }
So the new style uses a custom PythonClassInstance structure:
struct PythonClassInstance
{
    PyObject_HEAD
    ExtObjBase_noTemplate* m_pycxx_object;
}
PyObject_HEAD, if I dig into Python's object.h, is just a macro for PyObject ob_base; -- no further complications, like #if #else. So I don't see why it can't simply be:
struct PythonClassInstance
{
    PyObject ob_base;
    ExtObjBase_noTemplate* m_pycxx_object;
}
or even:
struct PythonClassInstance : PyObject
{
    ExtObjBase_noTemplate* m_pycxx_object;
}
Anyway, it seems that its purpose is to tag a pointer onto the end of a PyObject. This will be because Python runtime will often trigger functions we have placed in its function table, and the first parameter will be the PyObject responsible for the call. So this allows us to retrieve the associated C++ object.
But we also need to do that for the old-style class.
Here is the function responsible for doing that:
ExtObjBase_noTemplate* getExtObjBase( PyObject* pyob )
{
    if( pyob->ob_type->tp_flags & Py_TPFLAGS_BASETYPE )
    {
        /* 
        New style class uses a PythonClassInstance to tag on an additional 
           pointer onto the end of the PyObject
        The old style class just seems to typecast the pointer back up
           to ExtObjBase_noTemplate
        ExtObjBase_noTemplate does indeed derive from PyObject
        So it should be possible to perform this typecast
        Which begs the question, why on earth does the new style class feel 
          the need to do something different?
        This looks like a really nice way to solve the problem
        */
        PythonClassInstance* instance = reinterpret_cast<PythonClassInstance*>(pyob);
        return instance->m_pycxx_object;
    }
    else
        return static_cast<ExtObjBase_noTemplate*>( pyob );
}
My comment articulates my confusion.
And here, for completeness is us inserting a lambda-trampoline into the PyTypeObject's function pointer table, so that Python runtime can trigger it:
table->tp_setattro = [] (PyObject* self, PyObject* name, PyObject* val) -> int
{
   try {
        ExtObjBase_noTemplate* p = getExtObjBase( self );
        return ( p -> setattro(Object{name}, Object{val}) ); 
    }
    catch( Py::Exception& ) { /* indicate error */
        return -1;
    }
};
(In this demonstration I'm using tp_setattro, note that there are about 30 other slots, which you can see if you look at the doc for PyTypeObject)
(in fact the major reason for working this way is that we can try{}catch{} around every trampoline. This saves the consumer from having to code repetitive error trapping.)
So, we pull out the "base type for the associated C++ object" and call its virtual setattro (just using setattro as an example here). A derived class will have overridden setattro, and this override will get called.
The old-style class provides such an override, which I've labelled MARKER1 -- it is in the top listing for this question.
The only the thing I can think of is that maybe different maintainers have used different techniques. But is there some more compelling reason why old and new style classes require different architecture?
PS for reference, I should include the following methods from new style class:
    static PyObject* extension_object_new( PyTypeObject* subtype, PyObject* args, PyObject* kwds )
    {
        PyObject* pyob = subtype->tp_alloc(subtype,0);
        PythonClassInstance* o = reinterpret_cast<PythonClassInstance *>( pyob );
        o->m_pycxx_object = nullptr;
        return pyob;
    }
^ to me, this looks absolutely wrong. It appears to be allocating memory, re-casting to some structure that might exceed the amount allocated, and then nulling right at the end of this. I'm surprised it hasn't caused any crashes. I can't see any indication anywhere in the source code that these 4 bytes are owned.
    static int extension_object_init( PyObject* _self, PyObject* _args, PyObject* _kwds )
    {
        try
        {
            Object args{_args};
            Object kwds{_kwds};
            PythonClassInstance* self{ reinterpret_cast<PythonClassInstance*>(_self) };
            if( self->m_pycxx_object )
                self->m_pycxx_object->reinit( args, kwds );
            else
                // NOTE: observe this is where we invoke the constructor, but indirectly (i.e. through final)
                self->m_pycxx_object = new FinalClass{ self, args, kwds };
        }
        catch( Exception & )
        {
            return -1;
        }
        return 0;
    }
^ note that there is no implementation for reinit, other than the default
virtual void    reinit ( Object& args  , Object& kwds    ) { 
    throw RuntimeError( "Must not call __init__ twice on this class" ); 
}
    static void extension_object_deallocator( PyObject* _self )
    {
        PythonClassInstance* self{ reinterpret_cast< PythonClassInstance* >(_self) };
        delete self->m_pycxx_object;
        _self->ob_type->tp_free( _self );
    }
EDIT: I will hazard a guess, thanks to insight from Yhg1s on the IRC channel.
Maybe it is because when you create a new old-style class, it is guaranteed it will overlap perfectly a PyObject structure.
Hence it is safe to derive from PyObject, and pass a pointer to the underlying PyObject into Python, which is what the old-style class does (MARKER2)
On the other hand, new style class creates a {PyObject + maybe something else} object. i.e. It wouldn't be safe to do the same trick, as Python runtime would end up writing past the end of the base class allocation (which is only a PyObject).
Because of this, we need to get Python to allocate for the class, and return us a pointer which we store.
Because we are now no longer making use of the PyObject base-class for this storage, we cannot use the convenient trick of typecasting back to retrieve the associated C++ object. Which means that we need to tag on an extra sizeof(void*) bytes to the end of the PyObject that actually does get allocated, and use this to point to our associated C++ object instance.
However, there is some contradiction here.
struct PythonClassInstance
{
    PyObject_HEAD
    ExtObjBase_noTemplate* m_pycxx_object;
}
^ if this is indeed the structure that accomplishes the above, then it is saying that the new style class instance is indeed fitting exactly over a PyObject, i.e. It is not overlapping into the m_pycxx_object.
And if this is the case, then surely this whole process is unnecessary.
EDIT: here are some links that are helping me learn the necessary ground work:
http://eli.thegreenplace.net/2012/04/16/python-object-creation-sequence
http://realmike.org/blog/2010/07/18/introduction-to-new-style-classes-in-python
Create an object using Python's C API
 
                        
PyCXX does allocate enough memory, but it does so by accident. This appears to be a bug in PyCXX.
The amount of memory Python allocates for the object is determined by the first call to the following static member function of
PythonClass<T>:The constructor of
PythonTypesets thetp_basicsizeof the python type object tosizeof(T). This way when Python allocates an object it knows to allocate at leastsizeof(T)bytes. It works becausesizeof(T)turns out to be larger thatsizeof(PythonClassInstance)(Tis derived fromPythonClass<T>which derives fromPythonExtensionBase, which is large enough).However, it misses the point. It should actually allocate only
sizeof(PythonClassInstance). This appears to be a bug in PyCXX - that it allocates too much, rather than too little space for storing aPythonClassInstanceobject.Here's my theory why new style classes are different from the old style classes in PyCXX.
Before Python 2.2, where new style classes were introduced, there was no
tp_initmember int the type object. Instead, you needed to write a factory function that would construct the object. This is howPythonExtension<T>is supposed to work - the factory function converts the Python arguments to C++ arguments, asks Python to allocate the memory and then calls the constructor using placement new.Python 2.2 added the new style classes and the
tp_initmember. Python first creates the object and then calls thetp_initmethod. Keeping the old way would have required that the objects would first have a dummy constructor that creates an "empty" object (e.g. initializes all members to null) and then whentp_initis called, would have had an additional initialization stage. This makes the code uglier.It seems that the author of PyCXX wanted to avoid that. PyCXX works by first creating a dummy
PythonClassInstanceobject and then whentp_initis called, creates the actualPythonClass<T>object using its constructor.This appears to be correct, the
PyObjectbase class does not seem to be used anywhere. All the interesting methods ofPythonExtensionBaseuse the virtualself()method, which returnsm_class_instanceand completely ignore thePyObjectbase class.I guess (only a guess, though) is that
PythonClass<T>was added to an existing system and it seemed easier to just derive fromPythonExtensionBaseinstead of cleaning up the code.