Compiler details of the this pointer and virtual functions


I'm reading Bjarne's paper: "Multiple Inheritance for C++".

In section 3, page 370, Bjarne said that "The compiler turns a call of a member function into an "ordinary" function call with an "extra" argument; that "extra" argument is a pointer to the object for which the member function is called."

Consider a simple class A:

class A {
    int a;
    void f(int i);
};

A call of the member function A::f:

A* pa;
pa->f(2)

is transformed by the compiler to an "ordinary function call":

f__F1A(pa, 2)

pa is passed as the this pointer. It's easy to understand for the above example.

Consider the following code snippet:

class A {int a; void f(int);};
class B : A {int b; void g(int);};
class C : B {int c; void h(int);};

Question 1:

A call of the member function A::f:

C* pc = new C;
pc->g(int)

is transformed by the compiler to an "ordinary function call":

g__G1C(pc, int) or g__G1B((*B)pc, int)

Is the this pointer a *pc or (*B)pc? Another question is how the compiler knows where the member functions are.

Let's make the above example more interesting by adding the virtual keyword.

class A {
    int a;
    virtual void f(int);
    virtual void g(int);
    virtual void h(int);
};
class B : A {int b; void g(int); };
class C : B {int c; void h(int); };

An object of class C looks like:

C:

-----------                vtbl:
+0:  vptr -------------->  -----------
+4:  a                     +0: A::f
+8:  b                     +4: B::g
+12: c                     +8: C::h
-----------                -----------  

A call to a virtual function is transformed into an indirect call by the compiler. For example,

C* pc;
pc->g(2)

becomes something like:

(*((*pc)[1]))(pc, 2)

That is the conclusion I drew from Bjarne's paper.

Question 2:

(1) In the vtbl, I believe these function pointers are assigned at run time. How does the compiler know that the second function pointer should point to class B's implementation of g? How does the compiler figure that out?

(2) In the above example, all members are int, and we assume that the compiler allocates 4 bytes for an int. What if a member is a char? Does the compiler still allocate 4 bytes for it, or just one byte?

(3) In (*((*pc)[1]))(pc, 2), the this pointer is pc. Why not (*B)pc? Is there a rule for how the this pointer is passed?

Can anyone help me answer these questions? I would really appreciate it. I have a deadline tomorrow that is closely related to these problems. Please help!

Answer by user207421

Question 1:

A call of the member function A::f:

C* pc = new C;
pc->g(int)

This isn't a call to A::f(). It is a call to B::g().

is transformed by the compiler to an "ordinary function call":

g__G1C(pc, int) or g__G1B((*B)pc, int)

Is the this pointer a *pc or (*B)pc?

Neither. It is a B*.
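A minimal sketch of that point, under a few assumptions: the classes are written as structs with public inheritance so the check compiles outside the classes (the question's classes use default private access), and the seen_this member and the this_is_b_subobject function are hypothetical names added only for illustration. It shows that inside B::g the type of this is B*, and that the pointer received is the B subobject of the C object.

```cpp
#include <cassert>
#include <type_traits>

struct A { int a; void f(int) {} };

struct B : A {
    int b;
    const void* seen_this = nullptr;  // hypothetical: records the pointer g() received
    void g(int) {
        // Inside a non-const member function of B, `this` has type B*.
        static_assert(std::is_same<decltype(this), B*>::value,
                      "this is a B* inside B::g");
        seen_this = this;
    }
};

struct C : B { int c; void h(int) {} };

// Calls g through a C* and checks which address g() saw as `this`.
bool this_is_b_subobject() {
    C obj{};
    C* pc = &obj;
    pc->g(2);    // name lookup finds B::g; pc is converted to B* for the call
    B* pb = pc;  // the same implicit derived-to-base conversion, written out
    assert(obj.seen_this != nullptr);
    return obj.seen_this == static_cast<const void*>(pb);
}
```

Calling this_is_b_subobject() is expected to return true: the conversion from C* to B* is what the compiler applies to the object pointer before passing it as the extra argument.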

Another question is how the compiler knows where the member functions are.

It doesn't. It knows their names. The linker assigns their addresses.

Question 2:

(1) How does the compiler know that the second function pointer should point to class B's implementation of g? How does the compiler figure that out?

Because it is C's vtbl, and C inherits from B, and B has the nearest definition of g().
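The rule above can be emulated by hand. This is only a sketch with plain function pointers, not what any particular compiler emits: Obj, Slot, the A_f/B_g/C_h stand-ins and the vtbl_* arrays are all hypothetical names. The point is that each class's table can be written down from the class declarations alone, so nothing about the choice of B::g for slot 1 has to be discovered while the program runs.

```cpp
#include <cassert>

struct Obj;                       // hypothetical stand-in for an object with a vptr
using Slot = int (*)(Obj*, int);  // every slot takes the object pointer first

struct Obj {
    const Slot* vptr;  // points at the table of the object's class
    int a, b, c;
};

// Hypothetical stand-ins for the member functions; each returns a tag value.
int A_f(Obj*, int) { return 1; }
int A_g(Obj*, int) { return 2; }
int A_h(Obj*, int) { return 3; }
int B_g(Obj*, int) { return 20; }
int C_h(Obj*, int) { return 300; }

// One table per class, filled in from the declarations.
const Slot vtbl_A[] = {A_f, A_g, A_h};
const Slot vtbl_B[] = {A_f, B_g, A_h};  // B overrides g
const Slot vtbl_C[] = {A_f, B_g, C_h};  // C inherits B's g and overrides h

// pc->g(2): index slot 1 of the table and pass pc as the extra argument.
int call_g(Obj* pc) { return (*(pc->vptr[1]))(pc, 2); }

int g_on_a_c_object() {
    Obj obj{vtbl_C, 0, 0, 0};
    assert(obj.vptr == vtbl_C);
    return call_g(&obj);
}
```

Here g_on_a_c_object() returns 20, the tag of the B_g stand-in, because slot 1 of vtbl_C holds the nearest definition of g on the path from C up to A.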

(2) In the above example, all members are int, and we assume that the compiler allocates 4 bytes for an int. What if a member is a char? Does the compiler still allocate 4 bytes for it, or just one byte?

It depends on the alignment and packing rules of the processor, the compiler, compiler options, surrounding #pragmas, etc.
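One way to see what a given compiler does is to ask it. The sketch below uses hypothetical struct names (WithChar, OnlyChars) and only checks properties that hold everywhere; the actual sizes and the amount of padding are deliberately not stated, since they vary with the platform and compiler options.

```cpp
#include <cassert>
#include <cstddef>

struct WithChar  { char a; int b; };   // a char followed by an int
struct OnlyChars { char a; char b; };  // nothing here needs int alignment

// Bytes in WithChar that belong to neither member, i.e. padding.
std::size_t padding_in_with_char() {
    return sizeof(WithChar) - sizeof(char) - sizeof(int);
}

// The int member must start at an offset that satisfies int's alignment,
// so the compiler may leave unused bytes after the char.
bool int_member_is_aligned() {
    assert(sizeof(char) == 1);
    return offsetof(WithChar, b) % alignof(int) == 0;
}
```

Printing sizeof(WithChar), sizeof(OnlyChars) and padding_in_with_char() on the target platform shows whether the char was padded out there.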

(3) In (*((*pc)[1]))(pc, 2), the this pointer is pc. Why not (*B)pc?

This supposition contradicts your other question. It is B*.

Is there a rule for how the this pointer is passed?

See above.
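The virtual case from part (3) can be checked the same way with real virtual functions. This is a sketch under assumptions: public inheritance and int return values are used so the result is observable, and last_this, virtual_g_receives_b_pointer and call_through_base are hypothetical names that do not appear in the paper.

```cpp
#include <cassert>

struct A {
    int a = 0;
    virtual ~A() = default;
    virtual int f(int) { return 1; }
    virtual int g(int) { return 2; }
    virtual int h(int) { return 3; }
};

struct B : A {
    int b = 0;
    const B* last_this = nullptr;  // hypothetical: records what g() received
    int g(int) override { last_this = this; return 20; }
};

struct C : B {
    int c = 0;
    int h(int) override { return 300; }
};

// Calls g on a C object and checks both the chosen function and its `this`.
bool virtual_g_receives_b_pointer() {
    C obj;
    C* pc = &obj;
    int r = pc->g(2);  // dispatches to B::g, the nearest override
    assert(obj.last_this != nullptr);
    return r == 20 && obj.last_this == static_cast<B*>(pc);
}

// Sums the tags of whichever f, g and h the dynamic type selects.
int call_through_base(A& x) { return x.f(0) + x.g(0) + x.h(0); }
```

For a C object, call_through_base returns 321 (A::f, B::g, C::h), which matches the three slots drawn in the question's vtbl diagram.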