I know that Undefined Behaviour, once it has happened, makes it impossible to reason about the code any longer. I am convinced, completely. I even think I should not dig too much into understanding UB: a sane C++ program should not play with UB, period.
But to convince my colleagues and managers of the real danger, I am trying to find a concrete example based on a bug we DO have in the product (one they consider harmless: at worst, they think, it will always crash with an access violation).
My main concern is calling a virtual member function through a dangling pointer to a polymorphic class.
When a pointer is deleted, Windows writes a few bytes in the header of the heap block, and usually also overwrites the first bytes of the heap block itself. This is its way of keeping track of heap blocks, managing them as a linked list... OS stuff.
Though it's not defined in the C++ standard, polymorphism is implemented using virtual tables, AFAIK. Under Windows, the pointer to the virtual table is located in the first bytes of the heap block, given a class that inherits from only one base class. (It may be more complex with multiple inheritance, but I will not take this into account. Let's only consider a base class A, and several classes B, C, D inheriting from A.)
Now consider that I have a pointer to an A that was instantiated as a D object, and that this D object has been deleted elsewhere in the code. The heap block is now free and its first bytes have been overwritten, so the virtual table pointer now points almost at random somewhere in memory, let's say to the address 0x01234567.
When somewhere in the code, we call:
void test(A * pA)
{
    // here we do not know that pA is a dangling pointer
    // that memory address has been deleted by another thread, in another part of the code
    pA->SomeVirtualFunction();
}
Am I right in saying that:
- the runtime will interpret the memory at address 0x01234567 as if it were a virtual table
- if, while falsely interpreting this memory as a vtable, it does not touch a forbidden memory zone, there will not be any Access Violation
- the misinterpreted virtual table will provide a random address for the virtual function to execute, let's say 0x09876543
- the memory at the random address 0x09876543 will be interpreted as valid binary code, and EXECUTED for real
- this can lead to ANY corruption imaginable
I don't want to exaggerate just to be convincing. So, is what I'm saying correct, possible, and likely?
Your example is a possibility.
However, the situation is much, much worse.
If someone is attacking users of your application, then the memory will not contain random data. The attacker will try and likely manage to influence what that data will be. Once that happens, the attacker may be able to determine which code will be executed. And once that happens, unless your application is properly sandboxed (which I bet it is not with that attitude of your co-developers), the attacker may be able to take over the user's computer.
And that's not a hypothetical possibility, but something that has happened and will happen again.
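To make the mechanism concrete without actually invoking undefined behaviour, here is a minimal sketch that models virtual dispatch by hand. Everything in it (`FakeObject`, `FakeVTable`, `dispatch`, `overwriteFirstBytes`, `demo`) is a hypothetical name invented for illustration, and the layout is only an assumption about how a typical single-inheritance implementation works, not something the standard guarantees. The point it shows is that the call site blindly trusts whatever table pointer sits in the first bytes of the object.

```cpp
#include <cassert>
#include <string>

// Hand-rolled stand-in for compiler-generated virtual dispatch.
// Assumption: single inheritance, table pointer stored first in the object.
struct FakeObject;
using Slot = const char* (*)(FakeObject*);

struct FakeVTable {
    Slot slots[1];  // slot 0 plays the role of SomeVirtualFunction
};

struct FakeObject {
    const FakeVTable* vptr;  // first bytes of the block, like a real vptr
    int payload;
};

const char* intended(FakeObject*) { return "D::SomeVirtualFunction"; }
const char* attacker(FakeObject*) { return "attacker-chosen code"; }

const FakeVTable kVTableD{{&intended}};  // the table a live D would point to
const FakeVTable kForged{{&attacker}};   // data an attacker placed in memory

// Roughly what pA->SomeVirtualFunction() does: load the table pointer,
// load slot 0, call it. Nothing checks that the table is genuine.
const char* dispatch(FakeObject* p) {
    assert(p && p->vptr);
    return p->vptr->slots[0](p);
}

// Models the freed block being reused so that its first bytes change.
void overwriteFirstBytes(FakeObject* p, const FakeVTable* forged) {
    p->vptr = forged;
}

std::string demo() {
    FakeObject obj{&kVTableD, 42};
    std::string out = dispatch(&obj);     // the intended function runs
    overwriteFirstBytes(&obj, &kForged);  // simulated reuse of the block
    out += " -> ";
    out += dispatch(&obj);                // same call site, different code
    return out;
}
```

Printing `demo()` from a `main` gives `D::SomeVirtualFunction -> attacker-chosen code`. In the real bug the overwrite is done by the heap manager or by whoever gets the freed block next, and nothing constrains the new contents to be a well-formed table.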