Virtual inheritance example in C, exploiting undefined behavior?

In a series of articles, Dan Saks introduces a possible implementation of virtual functions in C. It relies more on static type-checking, which sets it apart from A.-T. Schreiner's solution based on void * pointers and dynamic type-checking.

Here is a stripped-down example without the vptrs and vtables of Saks' version (for the sake of simplicity, function pointers are just members of struct Base and struct Derived).

#include <stdlib.h>
#include <stdio.h>

typedef struct Base Base;

// Base "class"
struct Base {
    int (*get_param)(Base const *self);
};

// static inline, so no separate external definition is needed at link time
static inline int Base_get_param(Base const *self)
{
    return self->get_param(self);
}

typedef struct Derived Derived;

// Derived "class"
struct Derived {
    int (*get_param)(Derived const *self);
    int param;
};

// Defined before Derived_new so that it is declared at the point of use
static inline int Derived_get_param(Derived const *self)
{
    return self->param;
}

Derived * Derived_new(int param)
{
    Derived *self = malloc(sizeof(Derived));
    if (!self) abort();
    self->get_param = Derived_get_param;
    self->param = param;
    return self;
}

void Derived_delete(Derived *self)
{
    free(self);
}


int main()
{
    Derived *d = Derived_new(5);
    printf("%d\n", Derived_get_param(d));
    printf("%d\n", Base_get_param((Base *) d));  // <== undefined behavior?
    Derived_delete(d);
    return 0;
}

The gist is the function call (and cast) Base_get_param((Base *) d). Does this mean that the function pointer int (*get_param)(Derived const *self) gets "implicitly cast" to int (*get_param)(Base const *self)? Am I exploiting undefined behavior here (according to the C99 and C11 standards) because of incompatible types?

I get the proper output both with GCC 4.8 and clang 3.4. Is there a situation where the above implementation might be broken?

There is a detailed answer here about function pointer casts and compatible types but I am not sure about this case.

Answer by Shafik Yaghmour (best answer)

This program does indeed invoke undefined behavior. You have a violation of the strict aliasing rules here:

printf("%d\n", Base_get_param((Base *) d));
                               ^^^^^^^^^

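The strict aliasing rules make it undefined behavior to access an object through an lvalue of an incompatible type. Here, the object was written as a Derived and is then read inside Base_get_param through a Base lvalue. There is an exception for character types such as char *, which may alias any object without invoking undefined behavior.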
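
As a minimal sketch of what the rules do permit: the bytes of an object can be inspected through unsigned char *, and a value can be reinterpreted by copying its bytes with memcpy instead of casting the pointer. The helper names float_bits and byte_sum are hypothetical and only serve as illustration; the sketch assumes float is 32 bits wide.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: returns the object representation of a float as a
   32-bit unsigned integer. memcpy copies the bytes, so the float is never
   accessed through an lvalue of type uint32_t. */
uint32_t float_bits(float f)
{
    uint32_t bits;
    assert(sizeof bits == sizeof f);  /* assumption: float is 32 bits wide */
    memcpy(&bits, &f, sizeof bits);
    return bits;
}

/* Hypothetical helper: sums the bytes of any object. Reading through
   unsigned char * is covered by the character-type exception. */
unsigned byte_sum(void const *obj, size_t size)
{
    unsigned char const *p = obj;
    unsigned sum = 0;
    for (size_t i = 0; i < size; ++i)
        sum += p[i];
    return sum;
}
```

By contrast, writing *(uint32_t *)&f to read the float would be the kind of access the strict aliasing rules forbid.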

Basically, the compiler can optimize around the assumption that pointers of different types do not point to the same memory. Once you invoke undefined behavior, the result of your program becomes unpredictable.

In practice, in other cases such as in this question, I could not get the compiler to do the wrong thing, but in more complicated cases things may go wrong. See "gcc, strict-aliasing, and horror stories" for cases where it did cause issues. The article "Type Punning, Strict Aliasing, and Optimization" provides the following code:

#include <stdio.h>

void check (int *h, long *k)
{
  *h = 5;
  *k = 6;
  printf("%d\n", *h);
}

int main (void)
{
  long k;
  check((int *)&k, &k);
  return 0;
}

which violates strict aliasing and produces different output at -O1 versus -O2.

Strict aliasing can be turned off in gcc with -fno-strict-aliasing, and perhaps the author is assuming that, although I could not find this stated in the article. The flag disables some optimizations, so it does not come for free.
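
For comparison, here is a sketch of a variant that should not need the flag, assuming the structs can be restructured. It embeds Base as the first member of Derived, since C guarantees that a pointer to a struct, suitably converted, points to its initial member and vice versa. This is not necessarily the layout used in Saks' articles, and Derived_get_param_impl is a hypothetical name introduced here.

```c
#include <assert.h>
#include <stdlib.h>

typedef struct Base Base;

/* Base "class": the function pointer always takes a Base const *. */
struct Base {
    int (*get_param)(Base const *self);
};

/* Dispatches through the function pointer stored in the Base subobject. */
int Base_get_param(Base const *self)
{
    return self->get_param(self);
}

typedef struct Derived Derived;

/* Derived "class": Base must be the first member for the conversions
   between Derived * and Base * to be valid. */
struct Derived {
    Base base;
    int param;
};

/* Override with the Base signature; converts back to Derived inside. */
static int Derived_get_param_impl(Base const *self)
{
    Derived const *d = (Derived const *) self;
    return d->param;
}

Derived *Derived_new(int param)
{
    Derived *self = malloc(sizeof *self);
    if (!self) abort();
    assert((void *) self == (void *) &self->base);  /* first-member guarantee */
    self->base.get_param = Derived_get_param_impl;
    self->param = param;
    return self;
}

void Derived_delete(Derived *self)
{
    free(self);
}
```

A caller would create the object with Derived_new(5) and pass &d->base to Base_get_param. The function pointer is stored and called with the same type, and the Base subobject is only ever accessed as a Base, so neither the incompatible function pointer type nor the aliasing access from the question arises.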