I have my custom little OOP-esque inheritance functionality, something like this:
// base class
struct BaseTag;
typedef struct {
    int (*DoAwesomeStuff)(struct BaseTag* pInstance);
} S_BaseVtable;
typedef struct BaseTag {
    S_BaseVtable* pVtable;
    int AwesomeValue;
} S_Base;
// child class
struct ChildTag;
typedef struct {
    S_BaseVtable Base;
    void (*SomeOtherStuff)(struct ChildTag* pInstance);
} S_ChildVTable;
typedef struct ChildTag {
    S_Base BaseClass;
    int EvenAwesomerValue;
} S_Child;
Now let's say I have a Child class constructor where the Base class vtable is overridden with the child vtable:
void Child_ctor(S_Child* pInstance) {
    Base_ctor((S_Base*) pInstance);
    // pInstance is a pointer, so its members are reached with ->
    pInstance->BaseClass.pVtable = (S_BaseVtable*) &MyChildVTable;
}
Also in this child vtable, I want to override the DoAwesomeStuff() method from base class with a method like this:
int Child_DoAwesomeStuff(struct BaseTag* pInstance) {
    S_Child* pChild = (S_Child*) pInstance; // undefined behaviour
    return pChild->EvenAwesomerValue;
}
I have seen this pattern in variations occasionally, but I see some problems with it. My main questions are
- How can I access the S_ChildVtable from a child instance that is hidden behind a S_BaseVtable pointer?
- How can I properly cast the pInstance argument of Child_DoAwesomeStuff() to an S_Child* type?
As far as I understand the C standard, casting from S_Child* to S_Base* (and the corresponding vtable types) is okay as the first member of S_Child is an S_Base instance. But vice versa it is undefined behaviour.
Would something like S_Child* pChild = (S_Child*)((char*) pInstance) be legal and defined?
Edit
My question was a bit unclear and misleading. It's not the cast itself that I think is UB, but dereferencing pChild after it was cast from pInstance.
I browsed through the C11 standard again to find some reference, but now it's not so clear to me anymore.
6.3.2.3/7:
A pointer to an object type may be converted to a pointer to a different object type. If the resulting pointer is not correctly aligned (68) for the referenced type, the behavior is undefined. Otherwise, when converted back again, the result shall compare equal to the original pointer.
So I guess my question really is - What mechanics need to be in place so that it is ensured that S_Base and S_Child are correctly aligned?
Undefined behavior: memory usage in C, when it is or is not undefined behavior
1st, some background study: let us understand what is and is not undefined behavior when managing memory in C
As is frequently the case in programming, there are a lot of nuances to discuss. So, let me try to address the edits to your question.
In C, a cast can be undefined behavior for a variety of reasons, but not the casts you are doing in your question. See the comments below this answer for more insight.
Dereferencing can be undefined behavior for a few reasons as well. The two main ones I will talk about, which may be most relevant to your question, are accessing memory the program does not own and reading memory that has not been initialized.
Consider the following examples:
Example 1: pointing to memory our program does not own is undefined behavior
Undefined behavior: on any machine
NOT undefined behavior: on an ATmega328 8-bit microcontroller (ex: Arduino Uno)
Note that the proper way to do the above is this (example file: "/Arduino 1.8.13/hardware/tools/avr/avr/include/avr/iom328pb.h"):
Example 2: using memory we don't own, and/or that is uninitialized, is undefined behavior
Example 3: using a memory pool our program does own is not undefined behavior
Now, with the above knowledge learned, let's look back at your question:
The answer to this is: "it depends on whether you are dereferencing valid (owned, and already initialized if reading it) or invalid (not owned, or not initialized) memory."
Consider the following:
Let's go deeper to explore the 1st case where there is maybe undefined behavior.
So, is the (S_Child*)pBase; cast undefined behavior? No! But it is dangerous! Is accessing owned memory within pChild undefined behavior? No! We own it. Our program allocated it. But, is accessing memory outside what our program owns (ex: pChild->EvenAwesomerValue) undefined behavior? Yes! We do not own that memory. It is similar to the many undefined cases I went through above.
C++ has solved the dangerous behavior above by having the dynamic_cast<>() conversion which will allow casting a parent type to a child type. It will then dynamically, at run-time, check to see if the resulting object "is a valid complete object of the target type". If it discovers it is not, it sets the resulting pointer to nullptr to notify you of that. In C, you have to just track these things manually yourself.
"What mechanics need to be in place so that it is ensured that S_Base (parent) and S_Child are correctly aligned?"
This one's easy: just put your S_Base struct at the very beginning of your S_Child struct and they are automatically aligned. Now, a pointer to your S_Child object points to the exact same address as a pointer to the S_Base object within it, since the child contains the base object.
They are automatically aligned so long as you don't use any alignment or padding keywords or compiler extensions to change things. Padding is automatically added by the compiler after struct members, as needed, never before the first member. See more on that here: Structure padding and packing.
Simple example (without any virtual table polymorphism function stuff):
For the last (dangerous) cast above, C++ would allow you to have a dynamic cast which would fail at runtime if and only if you called it with C++ dynamic_cast syntax, and checked for errors, like this:
Key takeaway:
Once you first get alignment by putting the parent right at the beginning inside the child, basically just think of each object as a memory blob, or memory pool. If the memory pool you have (are pointing to) is larger than the expected size based on the pointer type pointing to it, you're fine! Your program owns that memory. But, if the memory pool you have (are pointing to) is smaller than the expected size based on the pointer type pointing to it, you're not fine! Accessing memory outside your allocated memory blob is undefined behavior.
In the case of OOP and parent/child relationships, the child object must always be larger than the parent object because it contains a parent object within it. So, casting a child to a parent type is fine, since the child type is larger than the parent type and the child type holds the parent type first in its memory, but casting a parent type to a child type is not fine unless the memory blob being pointed to was created initially as a child of that child type.
Now, let's look at this in C++ and compare to your C example.
Inheritance and parent <--> child type casting in C++ and C
So long as the pInstance pointer being passed to Child_DoAwesomeStuff() was actually constructed initially as an S_Child object, then casting the pointer back to an S_Child pointer (S_Child*) and dereferencing it is not undefined behavior. It would only be undefined behavior if you cast a pointer to an object that was constructed originally as a struct BaseTag (aka S_Base) type to a child pointer type and then access child members through it.
This is how C++ works too, with dynamic_cast<>() (which I mention in my answer here).
Example C++ code from https://cplusplus.com/doc/tutorial/typecasting/ under the "dynamic_cast" section is below.
In the C++ code below, notice that both pba and pbb are pointers to the base type (Base *), yet pba is actually constructed as a Derived (child) type via new Derived, whereas pbb is actually constructed as a Base (base, or parent) type via new Base.
Therefore, casting pba to Derived* is perfectly valid, since it truly is that type, but casting pbb to Derived* is not valid, since it is not truly that type. C++'s dynamic_cast<Derived*>(pbb) call catches this at run-time, detecting that the object is not a fully-formed Derived type, and returns a nullptr, which compares equal to 0, so you get the print that says Null pointer on second type-cast.
Here is that C++ code:
Output:
Similarly, your C code has the same behavior.
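Doing this is valid: constructing an object as an S_Child, passing its address around as an S_Base*, and casting that pointer back to S_Child*.
But doing this is not ok: constructing an object only as an S_Base, casting its address to S_Child*, and accessing child members through it.
So, for your specific function:
This is fine: calling Child_DoAwesomeStuff() with a pointer to an object that was constructed by Child_ctor().
But this is not!: calling Child_DoAwesomeStuff() with a pointer to an object that was only ever constructed as an S_Base.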
My thoughts on enforcing OOP (Object Oriented Programming) and inheritance in C
Just a warning though: passing around pointers and storing pointers to vtables and functions and things inside C structs will make tracing your code and trying to understand it very difficult! No indexer that I am aware of (Eclipse included, and Eclipse has the best indexer I've ever seen), can trace back to which function or type was assigned to a pointer in your code. Unless you're doing this stuff just for a learning exercise, or to bootstrap your own C++ language from scratch in C (again, for learning), I recommend against these patterns.
If you want "object-oriented" C with inheritance and all, don't do it. If you want "object-based" C, via opaque pointers/structs for basic private-member encapsulation and data hiding, that's just fine! Here's how I prefer to do that: Option 1.5 ("Object-based" C Architecture).
Last note: you probably know more about virtual tables (vtables) than I do. At the end of the day, it's your code, so do whichever architecture you want, but I don't want to be working in that code base :).