I am currently designing a programming language and I'm curious about how to solve this problem:
Suppose that I have a class (or interface) A that looks like this:
class A { // size is 4 bytes
int32 a = 0;
}
and a second class B that extends it and looks like this:
class B extends A { // size is 8 bytes
int32 b = 0;
}
and that I have a function f that looks like this:
int32 f(A first, A second) {
return first.a + second.a;
}
If I call it with two Bs, however, second.a would not be at the same location as when it is called with two As, because the larger first parameter would shift it. My current thoughts for solving this are:
- Disallowing unknown-size parameters, and forcing them to be passed as a pointer or reference (I think this is what Rust does)
- Writing all of the following information to the call stack: pointer to second, pointer to after second, non-variable size params, first, second
- Creating a function for each possible size of first and second and determining which one to call at compile time, if known, or at runtime using vtables.
The second idea would be a problem because it would need to be supported by all functions, even if they're rarely or never called using a subtype, which is inefficient.
The third idea would require a lot of functions to be created (a function that accepts 5 params, each of which can be one of 20 different subtypes, would require up to 20^5 = 3,200,000 similar pieces of code to be generated if it's called just once with unknown-type params), and would require a vtable for every class that has just one function using it. Also, a function in an already-compiled library could not be used with a new subtype.
Combining 2 and 3 and creating two versions of the same function, one that accepts only the exact type and another that accepts subtypes too, could solve a few of these problems.
I'm curious about whether there are better solutions to this, and how other languages such as C++ implement this.
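To make the first and third ideas concrete, here is a rough sketch of how they might look when written in C++ rather than in my language. The names f_ref, f_mono and make_b are made up for this illustration; this only shows the shape of the two approaches, not how any particular compiler implements them.

```cpp
#include <cstdint>

struct A { std::int32_t a = 0; };
struct B : A { std::int32_t b = 0; };

// Idea 1: parameters are references, which have a fixed size, so the
// callee's layout does not depend on how large the referenced objects are.
inline std::int32_t f_ref(const A& first, const A& second) {
    return first.a + second.a;
}

// Idea 3: one copy of the function is generated per combination of
// concrete argument types, chosen at compile time.
template <typename T, typename U>
std::int32_t f_mono(T first, U second) {
    return first.a + second.a;
}

// Hypothetical helper for the illustration: builds a B with given fields.
inline B make_b(std::int32_t a, std::int32_t b) {
    B r;
    r.a = a;
    r.b = b;
    return r;
}
```

With two Bs whose a fields are 1 and 2, both f_ref and f_mono return 3. The template version only works when the concrete types are known at compile time, which is the limitation I mentioned for already-compiled libraries.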
In C++, calling f(A) with a parameter of subtype B by value is equivalent to passing the argument through a static_cast to A. The static_cast can result in the memory at the same address just being reinterpreted as the start of a shorter block of data, or in transparently adding some offset first (if A is not the first base class or is virtual). After that, internally, a copy constructor of A is called. In either case, the information that was added by B is lost entirely, along with overrides of virtual functions. For all purposes, what is passed is no longer a B.
Dynamic polymorphism needs a reference or a pointer, much for the reasons you outline. But if you'd like to pass a "reference by value", the simplest solution would probably be passing a reference to a copy of the object. Note that in such a case each object would need to "know" what type it is in order to call the correct copy constructor, or be derived from a common superclass and implement some form of clone().
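A minimal sketch of both points, assuming A and B are given a virtual function so the loss of overrides is visible. The member value() and the functions by_value, by_reference and call_with_copy are hypothetical names chosen for this example, and std::make_unique requires C++14 or later.

```cpp
#include <cstdint>
#include <memory>

struct A {
    std::int32_t a = 0;
    virtual ~A() = default;
    virtual std::int32_t value() const { return a; }
    // Each class overrides clone() so that a copy of the correct dynamic
    // type can be made without the caller knowing that type.
    virtual std::unique_ptr<A> clone() const {
        return std::make_unique<A>(*this);
    }
};

struct B : A {
    std::int32_t b = 0;
    std::int32_t value() const override { return a + b; }
    std::unique_ptr<A> clone() const override {
        return std::make_unique<B>(*this);
    }
};

// By value: the parameter is copy-constructed as an A, so the B part and
// B's override of value() are lost.
inline std::int32_t by_value(A x) { return x.value(); }

// By reference: the dynamic type is preserved, and changes are visible to
// whoever owns the referenced object.
inline std::int32_t by_reference(A& x) {
    x.a += 100;
    return x.value();
}

// "Reference by value": clone first, then pass a reference to the copy, so
// the callee sees the full dynamic type but cannot modify the original.
inline std::int32_t call_with_copy(const A& original) {
    std::unique_ptr<A> copy = original.clone();
    return by_reference(*copy);
}
```

For a B with a = 1 and b = 2, by_value returns 1 because the override is gone, while call_with_copy returns 103 and leaves the original object unchanged.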