I've nearly finished rewriting a C++ Python wrapper (PyCXX).
The original allows both old- and new-style extension classes, and also allows one to derive from the new-style classes in Python:
```python
import test

# ok
a = test.new_style_class()

# also ok
class Derived( test.new_style_class ):
    def __init__( self ):
        test.new_style_class.__init__( self )
    def derived_func( self ):
        print( 'derived_func' )
        super().func_noargs()
    def func_noargs( self ):
        print( 'derived func_noargs' )

d = Derived()
```
The code is convoluted and appears to contain errors (see Why does PyCXX handle new-style classes in the way it does?).
My question is: What is the rationale/justification for PyCXX's convoluted mechanism? Is there a cleaner alternative?
Below I will attempt to detail where I am with this enquiry: first I will describe what PyCXX is doing at the moment, then what I think could be improved.
When the Python runtime encounters `d = Derived()`, it does `PyObject_Call( ob )` where `ob` is the `PyTypeObject` for `NewStyleClass`. I will write `ob` as `NewStyleClass_PyTypeObject`. That `PyTypeObject` has been constructed in C++ and registered using `PyType_Ready`.

`PyObject_Call` will invoke `type_call(PyTypeObject *type, PyObject *args, PyObject *kwds)`, returning an initialised `Derived` instance, i.e. something like:

```cpp
PyObject* derived_instance = type_call(NewStyleClass_PyTypeObject, NULL, NULL);
```

(All of this comes from http://eli.thegreenplace.net/2012/04/16/python-object-creation-sequence by the way, thanks Eli!)
`type_call` does essentially:

```cpp
obj = type->tp_new(type, args, kwds);
type->tp_init(obj, args, kwds);
```

And our C++ wrapper has inserted functions into the `tp_new` and `tp_init` slots of `NewStyleClass_PyTypeObject`, something like this:
```cpp
typeobject.set_tp_new( extension_object_new );
typeobject.set_tp_init( extension_object_init );

:

static PyObject* extension_object_new( PyTypeObject* subtype,
                                       PyObject* args, PyObject* kwds )
{
    PyObject* pyob = subtype->tp_alloc(subtype,0);

    Bridge* o = reinterpret_cast<Bridge *>( pyob );
    o->m_pycxx_object = nullptr;

    return pyob;
}

static int extension_object_init( PyObject* _self,
                                  PyObject* args, PyObject* kwds )
{
    Bridge* self{ reinterpret_cast<Bridge*>(_self) };

    // NOTE: observe this is where we invoke the constructor,
    //       but indirectly (i.e. through FinalClass)
    self->m_pycxx_object = new FinalClass{ self, args, kwds };

    return 0;
}
```
Note that we need to bind together the Python `Derived` instance and its corresponding C++ class instance. (Why? Explained below, see 'X'.) To do that we are using:

```cpp
struct Bridge
{
    PyObject_HEAD               // <-- a PyObject
    ExtObjBase* m_pycxx_object;
};
```
Now this bridge raises a question. I'm very suspicious of this design.

Note how memory was allocated for this new `PyObject`:

```cpp
PyObject* pyob = subtype->tp_alloc(subtype,0);
```

And then we typecast this pointer to `Bridge`, and use the 4 or 8 (`sizeof(void*)`) bytes immediately following the `PyObject` to point to the corresponding C++ class instance (this gets hooked up in `extension_object_init`, as can be seen above).
Now for this to work we require (see the sketch below):

a) `subtype->tp_alloc(subtype,0)` must be allocating an extra `sizeof(void*)` bytes

b) the `PyObject` doesn't require any memory beyond `sizeof(PyObject_HEAD)`, because if it did then it would conflict with the above pointer
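These two requirements can be expressed as compile-time checks. Here is a minimal, hypothetical sketch (not part of PyCXX) of how one might assert them against the `Bridge` layout shown above; requirement (a) is really a statement about the value fed into `tp_basicsize` (discussed further below), since that is what `tp_alloc` uses to size the allocation:

```cpp
#include <Python.h>
#include <cstddef>

struct ExtObjBase;               // stand-in forward declaration for this sketch

struct Bridge
{
    PyObject_HEAD                // expands to the PyObject header
    ExtObjBase* m_pycxx_object;  // back-pointer to the C++ object
};

// (a) Bridge carries an extra sizeof(void*) for the back-pointer; tp_basicsize
//     must then be set to sizeof(Bridge) so that tp_alloc actually reserves it
static_assert( sizeof(Bridge) >= sizeof(PyObject) + sizeof(void*),
               "Bridge must have room for the C++ back-pointer" );

// (b) the back-pointer must live entirely after the PyObject header, so the
//     runtime's own header fields cannot overlap it
static_assert( offsetof(Bridge, m_pycxx_object) >= sizeof(PyObject),
               "m_pycxx_object must not overlap the PyObject header" );
```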
One major question I have at this point is:

Can we guarantee that the `PyObject` that the Python runtime has created for our `derived_instance` does not overlap into `Bridge`'s `ExtObjBase* m_pycxx_object` field?

I will attempt to answer it: it is US determining how much memory gets allocated. When we create `NewStyleClass_PyTypeObject` we feed in how much memory we want this `PyTypeObject` to allocate for a new instance of this type:
```cpp
template< TEMPLATE_TYPENAME FinalClass >
class ExtObjBase : public FuncMapper<FinalClass> , public ExtObjBase_noTemplate
{
protected:
    static TypeObject& typeobject()
    {
        static TypeObject* t{ nullptr };

        if( ! t )
            t = new TypeObject{ sizeof(FinalClass), typeid(FinalClass).name() };
                             /* ^^^^^^^^^^^^^^^^^ this is the bug BTW!
                                The C++ Derived class instance never gets deposited
                                in the memory allocated by the Python runtime
                                (controlled by this parameter).
                                This value should be sizeof(Bridge) -- as pointed out
                                in the answer to the question linked above */

        return *t;
    }
    :
};
```
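For reference, the fix implied by that comment is simply to pass the size of the `Bridge` wrapper rather than the size of the C++ class, along these lines (a sketch, assuming the `Bridge` struct shown earlier):

```cpp
t = new TypeObject{ sizeof(Bridge), typeid(FinalClass).name() };
```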
```cpp
class TypeObject
{
private:
    PyTypeObject* table;

    // these tables fit into the main table via pointers
    PySequenceMethods* sequence_table;
    PyMappingMethods*  mapping_table;
    PyNumberMethods*   number_table;
    PyBufferProcs*     buffer_table;

public:
    PyTypeObject* type_object() const
    {
        return table;
    }

    // NOTE: if you define one sequence method you must define all of them except the assigns
    TypeObject( size_t size_bytes, const char* default_name )
        : table{ new PyTypeObject{} } // {} sets to 0
        , sequence_table{}
        , mapping_table{}
        , number_table{}
        , buffer_table{}
    {
        PyObject* table_as_object = reinterpret_cast<PyObject*>( table );

        *table_as_object = PyObject{ _PyObject_EXTRA_INIT 1, NULL };
        // ^ py_object_initializer -- NULL because type must be init'd by user

        table_as_object->ob_type = _Type_Type();

        // QQQ table->ob_size = 0;
        table->tp_name      = const_cast<char *>( default_name );
        table->tp_basicsize = size_bytes;
        table->tp_itemsize  = 0; // sizeof(void*); // so as to store extra pointer
        table->tp_dealloc   = ...
```
You can see it going in as `table->tp_basicsize`.

But now it seems clear to me that `PyObject`s generated from `NewStyleClass_PyTypeObject` will never require additional allocated memory, which means that this whole `Bridge` mechanism is unnecessary.

So PyCXX's original technique is looking good: use `PyObject` as a base class of `NewStyleClassCXXClass`, and initialise this base so that the Python runtime's `PyObject` for `d = Derived()` is in fact this base. That allows seamless typecasting: whenever the Python runtime calls a slot from `NewStyleClass_PyTypeObject`, it will be passing a pointer to `d`'s `PyObject` as the first parameter, and we can just typecast back to `NewStyleClassCXXClass`. <-- 'X' (referenced above)
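A minimal sketch of that technique, to make the typecasting point concrete (class and slot names here are illustrative, not actual PyCXX identifiers):

```cpp
#include <Python.h>

// The C++ wrapper class has PyObject as a non-virtual base, so the PyObject*
// that the runtime passes to a slot *is* the base subobject of the C++ instance.
class NewStyleClassCXXClass : public PyObject
{
public:
    void func_noargs()
    {
        // ... actual behaviour ...
    }
};

// A slot implementation can then recover the C++ object with a plain downcast:
static PyObject* slot_func_noargs( PyObject* self, PyObject* /*args*/ )
{
    NewStyleClassCXXClass* obj = static_cast<NewStyleClassCXXClass*>( self );
    obj->func_noargs();
    Py_RETURN_NONE;
}
```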
So really my question is: why don't we just do this? Is there something special about deriving from `NewStyleClass` that forces extra allocation for the `PyObject`?

I realise I don't understand the creation sequence in the case of a derived class. Eli's post didn't cover that.
I suspect this may be connected with the fact that the first parameter of

```cpp
static PyObject* extension_object_new( PyTypeObject* subtype, ...
```

is named `subtype`. I don't understand this, and I wonder if this may hold the key.
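The picture I have so far (which may well be wrong) is that the same inherited `tp_new` slot is reached whether the base type or a Python subclass is being instantiated, and `subtype` is whichever type is actually being created. A hypothetical illustration, not PyCXX code:

```cpp
// a = test.new_style_class()  ->  extension_object_new( NewStyleClass_PyTypeObject, ... )
// d = Derived()               ->  extension_object_new( DerivedHeapType, ... )
//
// DerivedHeapType is the heap type built by `class Derived(...)`. Its
// tp_basicsize is normally larger than the base's, because Python adds room
// for __dict__ and __weakref__, so allocating via subtype->tp_alloc(subtype, 0)
// reserves the right amount of memory in both cases.
static PyObject* extension_object_new( PyTypeObject* subtype,
                                       PyObject* args, PyObject* kwds )
{
    PyObject* pyob = subtype->tp_alloc( subtype, 0 );

    Bridge* o = reinterpret_cast<Bridge*>( pyob );
    o->m_pycxx_object = nullptr;

    return pyob;
}
```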
EDIT: I thought of one possible explanation for why PyCXX is using `sizeof(FinalClass)` for initialisation. It might be a relic from an idea that got tried and discarded: if Python's `tp_new` call allocates enough space for the `FinalClass` (which has the `PyObject` as base), maybe a new `FinalClass` can be constructed at that exact location using placement new, or some cunning `reinterpret_cast` business. My guess is this might have been tried, found to pose some problem, worked around, and the relic got left behind.
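To make that guess concrete, here is a sketch of what such a discarded variant might have looked like (purely hypothetical, not actual PyCXX code; the `FinalClass` constructor signature is assumed):

```cpp
#include <Python.h>
#include <new>

// Hypothetical final class whose base is PyObject.
class FinalClass : public PyObject
{
public:
    FinalClass( PyObject* args, PyObject* kwds ) { /* ... */ }
};

static PyObject* extension_object_new( PyTypeObject* subtype,
                                       PyObject* args, PyObject* kwds )
{
    // Assumes tp_basicsize was set to sizeof(FinalClass), so the block
    // returned by tp_alloc is big enough to hold the whole C++ object.
    PyObject* pyob = subtype->tp_alloc( subtype, 0 );
    if( pyob == nullptr )
        return nullptr;

    // Construct the C++ object in place; its PyObject base subobject sits in
    // the header that tp_alloc just initialised. Note this clobbers the
    // ob_refcnt / ob_type fields tp_alloc set up, so they would need to be
    // saved and restored -- possibly the very problem that got the idea
    // discarded.
    new (pyob) FinalClass( args, kwds );

    return pyob;
}
```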
PyCXX is not convoluted. It does have two bugs, but they can be easily fixed without requiring significant changes to the code.
When creating a C++ wrapper for the Python API, one encounters a problem. The C++ object model and the Python new-style object model are very different. One fundamental difference is that C++ has a single constructor that both creates and initializes the object, while Python has two stages: `tp_new` creates the object and performs minimal initialization (or just returns an existing object), and `tp_init` performs the rest of the initialization. PEP 253, which you should probably read in its entirety, says:
...
The entire point of a C++ wrapper is to enable you to write nice C++ code. Say, for example, that you want your object to have a data member that can only be initialized during its construction. If you create the object during `tp_new`, then you cannot reinitialize that data member during `tp_init`. This will probably force you to hold that data member via some kind of a smart pointer and create it during `tp_new`. This makes the code ugly.

The approach PyCXX takes is to separate object construction into two: `tp_new` creates a dummy object holding just a pointer to the C++ object, which is created in `tp_init`. This pointer is initially null. `tp_init` allocates and constructs the actual C++ object, then updates the pointer in the dummy object created in `tp_new` to point to it. If `tp_init` is called more than once, it raises a Python exception.

I personally think that the overhead of this approach is too high for my own applications, but it's a legitimate approach. I have my own C++ wrapper around the Python C API that does all the initialization in `tp_new`, which is also flawed. There doesn't appear to be a good solution for that.
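A minimal sketch of that two-stage scheme, including the double-initialization guard, might look something like this (assuming the `Bridge` struct and `FinalClass` from the question; the exact exception type and message are illustrative):

```cpp
static PyObject* extension_object_new( PyTypeObject* subtype,
                                       PyObject* args, PyObject* kwds )
{
    // Stage 1: allocate the dummy object; the C++ object does not exist yet.
    PyObject* pyob = subtype->tp_alloc( subtype, 0 );
    if( pyob == nullptr )
        return nullptr;

    reinterpret_cast<Bridge*>( pyob )->m_pycxx_object = nullptr;
    return pyob;
}

static int extension_object_init( PyObject* _self, PyObject* args, PyObject* kwds )
{
    Bridge* self = reinterpret_cast<Bridge*>( _self );

    // Guard: refuse to run twice (e.g. if __init__ is invoked again explicitly).
    if( self->m_pycxx_object != nullptr )
    {
        PyErr_SetString( PyExc_RuntimeError, "object is already initialized" );
        return -1;
    }

    // Stage 2: construct the real C++ object and attach it to the dummy.
    self->m_pycxx_object = new FinalClass{ self, args, kwds };
    return 0;
}
```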