I have my custom little OOP-esque inheritance functionality, something like this:
// base class
struct BaseTag;
typedef struct {
    int (*DoAwesomeStuff)(struct BaseTag* pInstance);
} S_BaseVtable;
typedef struct BaseTag {
    S_BaseVtable* pVtable;
    int AwesomeValue;
} S_Base;
// child class
struct ChildTag;
typedef struct {
    S_BaseVtable Base;
    void (*SomeOtherStuff)(struct ChildTag* pInstance);
} S_ChildVTable;
typedef struct ChildTag {
    S_Base BaseClass;
    int EvenAwesomerValue;
} S_Child;
Now let's say I have a Child class constructor where the Base class vtable is overridden with the child vtable:
void Child_ctor(S_Child* pInstance) {
    Base_ctor((S_Base*) pInstance);
    // pInstance is a pointer, so its members are reached with ->
    pInstance->BaseClass.pVtable = (S_BaseVtable*) &MyChildVTable;
}
Also in this child vtable, I want to override the DoAwesomeStuff() method from the base class with a method like this:
int Child_DoAwesomeStuff(struct BaseTag* pInstance) {
    S_Child* pChild = (S_Child*) pInstance; // undefined behaviour
    return pChild->EvenAwesomerValue;
}
I have seen this pattern in variations occasionally, but I see some problems with it. My main questions are
- How can I access the S_ChildVTable from a child instance that is hidden behind an S_BaseVtable pointer?
- How can I properly cast the pInstance argument of Child_DoAwesomeStuff() to an S_Child* type?
As far as I understand the C standard, casting from S_Child* to S_Base* (and the corresponding vtable types) is okay as the first member of S_Child is an S_Base instance. But vice versa it is undefined behaviour.
Would something like S_Child* pChild = (S_Child*)((char*) pInstance) be legal and defined?
Edit
My question was a bit unclear and misleading. It's not the cast itself that I think is UB, but dereferencing pChild after it was cast from pInstance.
I browsed through the C11 standard again to find some reference, but now it's not so clear to me anymore.
6.3.2.3/7:
A pointer to an object type may be converted to a pointer to a different object type. If the resulting pointer is not correctly aligned (68) for the referenced type, the behavior is undefined. Otherwise, when converted back again, the result shall compare equal to the original pointer.
So I guess my question really is - What mechanics need to be in place so that it is ensured that S_Base and S_Child are correctly aligned?
Undefined behavior: memory usage in C, when it is or is not undefined behavior
1st, some background study: let us understand what is and is not undefined behavior when managing memory in C
As is frequently the case in programming, there are a lot of nuances to discuss. So, let me try to address the edits to your question.
In C, a cast can be undefined behavior for a variety of reasons, but none of them apply to the casts you are doing in your question. See the comments below this answer for more insight.
Dereferencing can be undefined behavior for a few reasons as well. The two main ones I will talk about, which may be most relevant to your question, are accessing memory our program does not own and reading memory that has not been initialized.
Consider the following examples:
Example 1: pointing to memory our program does not own is undefined behavior
Undefined behavior: on any machine
NOT undefined behavior: on an ATmega328 8-bit microcontroller (ex: Arduino Uno)
Note that the proper way to do the above is this (example file: "/Arduino 1.8.13/hardware/tools/avr/avr/include/avr/iom328pb.h"):
Example 2: using memory we don't own, and/or that is uninitialized, is undefined behavior
Example 3: using a memory pool our program does own is not undefined behavior
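A minimal sketch of Example 3, assuming a heap block from malloc() serves as the pool. S_Item and pool_demo are hypothetical names used only for illustration; they are not from the question.

```c
#include <stdlib.h>

/* Hypothetical item type, used only for illustration. */
typedef struct {
    int id;
    int value;
} S_Item;

/* Carve `count` items out of one block our program owns. Returns the
 * sum of the stored values, or -1 if the allocation fails. */
int pool_demo(size_t count) {
    S_Item *pool = malloc(count * sizeof *pool);
    if (pool == NULL) {
        return -1;
    }
    /* Writing initializes the memory, so the later reads are defined. */
    for (size_t i = 0; i < count; i++) {
        pool[i].id = (int)i;
        pool[i].value = (int)(i * 10);
    }
    int sum = 0;
    for (size_t i = 0; i < count; i++) {
        sum += pool[i].value;
    }
    /* Reading pool[count].value here would reach outside the block we
     * own, which is undefined behavior. */
    free(pool);
    return sum;
}
```

Calling pool_demo(4) stays inside the owned, initialized block and returns 60 (0 + 10 + 20 + 30).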
Now, with the above knowledge learned, let's look back at your question:
The answer to this is: "it depends on whether you are dereferencing valid (owned, and already initialized if you are reading it) or invalid (not owned, or not initialized) memory."
Consider the following:
Let's go deeper to explore the 1st case where there is maybe undefined behavior.
So, is the (S_Child*)pBase; cast undefined behavior? No! But it is dangerous! Is accessing owned memory within pChild undefined behavior? No! We own it. Our program allocated it. But is accessing memory outside what our program owns (ex: pChild->EvenAwesomerValue) undefined behavior? Yes! We do not own that memory. It is similar to the many undefined cases I went through above.

C++ has solved the dangerous behavior above by having the dynamic_cast<>() conversion, which allows casting a parent type to a child type. It then dynamically, at run-time, checks whether the resulting object "is a valid complete object of the target type". If it discovers it is not, it sets the resulting pointer to nullptr to notify you of that. In C, you have to track these things manually yourself.

"What mechanics need to be in place so that it is ensured that S_Base (parent) and S_Child are correctly aligned?"

This one's easy: just put your S_Base struct at the very beginning of your S_Child struct and they are automatically aligned. Now, a pointer to your S_Child object points to the exact same address as a pointer to the S_Base object within it, since the child contains the base object.

They are automatically aligned so long as you don't use any alignment or padding keywords or compiler extensions to change things. Padding is automatically added by the compiler after struct members, as needed, never before the first member. See more on that here: Structure padding and packing.
Simple example (without any virtual table polymorphism function stuff):
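A minimal sketch of such an example, assuming simplified versions of the question's S_Base and S_Child with the vtable pointer left out. The function name roundtrip_demo is hypothetical.

```c
#include <stddef.h>
#include <stdlib.h>

/* Simplified versions of the question's structs (no vtable pointer). */
typedef struct {
    int AwesomeValue;
} S_Base;

typedef struct {
    S_Base BaseClass;      /* parent first, so both share one address */
    int EvenAwesomerValue;
} S_Child;

/* Compile-time check (C11) that no padding precedes the parent. */
_Static_assert(offsetof(S_Child, BaseClass) == 0, "parent must come first");

/* Returns EvenAwesomerValue read back through a base pointer, or -1 if
 * the allocation fails or the addresses unexpectedly differ. */
int roundtrip_demo(void) {
    S_Child *pChild = malloc(sizeof *pChild);
    if (pChild == NULL) {
        return -1;
    }
    pChild->BaseClass.AwesomeValue = 1;
    pChild->EvenAwesomerValue = 2;

    /* Child -> parent: always fine. */
    S_Base *pBase = &pChild->BaseClass;

    /* Parent -> child: fine here, because the object behind pBase
     * really was created as an S_Child. */
    S_Child *pBack = (S_Child *)pBase;

    int result = (pBack == pChild) ? pBack->EvenAwesomerValue : -1;
    free(pChild);
    return result;
}
```

roundtrip_demo() returns 2. If the block had instead been allocated as sizeof(S_Base), the same cast would compile, but reading pBack->EvenAwesomerValue would reach past the memory the program owns.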
For the last (dangerous) cast above, C++ lets you detect the failure at run-time, but only if you use the C++ dynamic_cast syntax and check the result for errors, like this:
Key takeaway:
Once you get alignment by putting the parent right at the beginning inside the child, basically just think of each object as a memory blob, or memory pool. If the memory pool you are pointing to is at least as large as the size expected by the pointer type pointing to it, you're fine! Your program owns that memory. But if the memory pool you are pointing to is smaller than the size expected by the pointer type pointing to it, you're not fine! Accessing memory outside your allocated memory blob is undefined behavior.
In the case of OOP and parent/child relationships, the child object is always at least as large as the parent object because it contains a parent object within it. So, casting a child to a parent type is fine, since the child type holds the parent type first in its memory, but casting a parent type to a child type is not fine unless the memory blob being pointed to was created initially as an object of that child type.
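Since C has no dynamic_cast, one way to track the real type manually is to store a tag in the parent and check it before casting down. This is only a sketch under that assumption: E_TypeTag, the Tag member, Base_init, Child_init, and Child_from_base are hypothetical names, not part of the question's code.

```c
#include <stddef.h>

/* Hypothetical tag recording what an object was constructed as. */
typedef enum { TYPE_BASE, TYPE_CHILD } E_TypeTag;

typedef struct {
    E_TypeTag Tag;
    int AwesomeValue;
} S_Base;

typedef struct {
    S_Base BaseClass;      /* parent first */
    int EvenAwesomerValue;
} S_Child;

void Base_init(S_Base *p) {
    p->Tag = TYPE_BASE;
    p->AwesomeValue = 0;
}

void Child_init(S_Child *p) {
    Base_init(&p->BaseClass);
    p->BaseClass.Tag = TYPE_CHILD;   /* mark the whole blob as a child */
    p->EvenAwesomerValue = 0;
}

/* Checked downcast: returns NULL unless the object was built as a child. */
S_Child *Child_from_base(S_Base *p) {
    if (p == NULL || p->Tag != TYPE_CHILD) {
        return NULL;
    }
    return (S_Child *)p;
}
```

With this in place, Child_from_base() on the BaseClass member of an initialized S_Child returns the child's own address, while calling it on a plain S_Base returns NULL instead of handing back a pointer that would be dangerous to dereference.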
Now, let's look at this in C++ and compare to your C example.
Inheritance and parent <--> child type casting in C++ and C
So long as the pInstance pointer being passed to Child_DoAwesomeStuff() was actually constructed initially as an S_Child object, then casting the pointer back to an S_Child pointer (S_Child*) and using it is not undefined behavior. It would only be undefined behavior if you cast a pointer to an object that was constructed originally as a struct BaseTag (aka S_Base) type to a child pointer type and then access child-only members through it.

This is how C++ works too, with dynamic_cast<>() (which I mention in my answer here). Example C++ code from https://cplusplus.com/doc/tutorial/typecasting/ under the "dynamic_cast" section is below.

In the C++ code below, notice that both pba and pbb are pointers to the base type (Base *), yet pba is actually constructed as a Derived (child) type via new Derived, whereas pbb is actually constructed as a Base (base, or parent) type via new Base.

Therefore, casting pba to Derived* is perfectly valid, since it truly is that type, but casting pbb to Derived* is not valid, since it is not truly that type. C++'s dynamic_cast<Derived*>(pbb) call catches this invalid cast at run-time, detecting that the object is not a fully-formed Derived type, and returns a nullptr, which is equal to 0, so you get the print that says Null pointer on second type-cast.
Here is that C++ code:
Output:
Similarly, your C code has the same behavior.
Doing this is valid:
But doing this is not ok:
So, for your specific function:
This is fine:
But this is not!:
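Putting the pieces together, here is a sketch of how the question's types might be wired up. Only the struct layouts, Base_ctor, Child_ctor, MyChildVTable, and Child_DoAwesomeStuff are named in the question; Base_DoAwesomeStuff, Child_SomeOtherStuff, Child_vtable, MyBaseVTable, and the initial values 1 and 41 are assumptions made for illustration.

```c
#include <stddef.h>

struct BaseTag;
typedef struct {
    int (*DoAwesomeStuff)(struct BaseTag *pInstance);
} S_BaseVtable;

typedef struct BaseTag {
    S_BaseVtable *pVtable;
    int AwesomeValue;
} S_Base;

struct ChildTag;
typedef struct {
    S_BaseVtable Base;   /* base vtable first, mirroring S_Child */
    void (*SomeOtherStuff)(struct ChildTag *pInstance);
} S_ChildVTable;

typedef struct ChildTag {
    S_Base BaseClass;
    int EvenAwesomerValue;
} S_Child;

static int Base_DoAwesomeStuff(struct BaseTag *pInstance) {
    return pInstance->AwesomeValue;
}

static int Child_DoAwesomeStuff(struct BaseTag *pInstance) {
    /* Defined only when pInstance points into a real S_Child. */
    S_Child *pChild = (S_Child *)pInstance;
    return pChild->EvenAwesomerValue;
}

static void Child_SomeOtherStuff(struct ChildTag *pInstance) {
    pInstance->EvenAwesomerValue += 1;
}

static S_BaseVtable MyBaseVTable = { Base_DoAwesomeStuff };
static S_ChildVTable MyChildVTable = {
    { Child_DoAwesomeStuff }, Child_SomeOtherStuff
};

void Base_ctor(S_Base *pInstance) {
    pInstance->pVtable = &MyBaseVTable;
    pInstance->AwesomeValue = 1;      /* assumed initial value */
}

void Child_ctor(S_Child *pInstance) {
    Base_ctor(&pInstance->BaseClass);
    /* Point at the first member of the child vtable; no cast needed. */
    pInstance->BaseClass.pVtable = &MyChildVTable.Base;
    pInstance->EvenAwesomerValue = 41; /* assumed initial value */
}

/* Recover the child vtable. Valid because pVtable points at the first
 * member of MyChildVTable; call only on objects built by Child_ctor. */
S_ChildVTable *Child_vtable(S_Child *pInstance) {
    return (S_ChildVTable *)pInstance->BaseClass.pVtable;
}
```

Dispatching through pVtable keeps the two cases apart: an object built by Child_ctor reaches Child_DoAwesomeStuff and the downcast inside it is fine, while an object built by Base_ctor reaches Base_DoAwesomeStuff. Calling Child_DoAwesomeStuff directly with a pointer to a plain S_Base is the case that is not fine, since the read of EvenAwesomerValue goes past the end of that object.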
My thoughts on enforcing OOP (Object-Oriented Programming) and inheritance in C
Just a warning though: passing around pointers and storing pointers to vtables and functions and things inside C structs will make tracing your code and trying to understand it very difficult! No indexer that I am aware of (Eclipse included, and Eclipse has the best indexer I've ever seen), can trace back to which function or type was assigned to a pointer in your code. Unless you're doing this stuff just for a learning exercise, or to bootstrap your own C++ language from scratch in C (again, for learning), I recommend against these patterns.
If you want "object-oriented" C with inheritance and all, don't do it. If you want "object-based" C, via opaque pointers/structs for basic private-member encapsulation and data hiding, that's just fine! Here's how I prefer to do that: Option 1.5 ("Object-based" C Architecture).
Last note: you probably know more about virtual tables (vtables) than I do. At the end of the day, it's your code, so do whichever architecture you want, but I don't want to be working in that code base :).