virtual member functions are good or bad for locality in modern CPUs?

253 views Asked by At

Considering the new CPUs with new instructions for moving and new memory controllers, if in C++ I have a vector of Derived objects where Derived is composed of virtual member functions, is this a good or a bad thing for the locality ?

And what if I have a vector of pointers to the base class Base* where I store references to derived objects that are 1-2-3 level up from Base ?

Basically dynamic typing applies to both cases, but which one is better for caching and memory access ?

I have a preference between this 2 but I would like to see a complete answer on the subject.

There is something new to consider as ground-braking from the hardware industry in the last 2-3 years ?

2

There are 2 answers

3
AudioBubble On BEST ANSWER

Storing Derived rather than Base * in a vector is better because it eliminates one extra level of indirection and you have all objects laid out «together» in a continuous memory, which in turn makes life easier for a hardware prefetcher, helps with paging, TLB misses, etc. However, if you do this, make sure you don't introduce a slicing problem.

As for the virtual dispatch in this case, it almost does not matter with an exception of adjustment required for «this» pointer. For example, if Derived overrides a virtual function that you are calling and you already have a pointer to Devied *, then «this» adjustment is not required, and otherwise it should be adjusted to one of the base class`s «this» value (this also depends on size of the classes in inheritance hierarchy).

As long as all classes in a vector have the same overloads, CPU would be able to predict what's going on. However, if you have a mix of different implementations, then CPU would have no clue as to what function will be called for every next object, and that might cause performance issues.

And don't forget to always profile before and after you make changes.

4
Mats Petersson On

Modern CPU's know how to optimise data-dependent jump instructions, as well as it can for data dependent "branch" instructions - the processor will "learn" that "Last time I went through here, I went THIS way", and if it has enough confidence (gone through several times with the same result) it ill keep going that way.

Of course that doesn't help if the instances are a complete random selection of different classes that each have it's own virtual function.

Cache-locality is of course a slightly different matter, and it really depends on whether you are storing the object instances or the pointers/references to instances in the vector.

And of course, an important factor is "what is the alternative?" - if you are using virtual functions "correctly", it means that there is (at least) one less conditional check in a code-path, because the decision was taken at a much earlier stage. That condition would be (assuming the probability corresponds the same) to the branch probability of the decision, if you solve it by some other method - which will be at least as bad for performance as virtual functions with the same probability (chances are that it's worse, because we now have a if (x) foo(); else bar(); type scenario, so we first have to evaluate x then choose the path. obj->vfunc() will just be unpredictable because fetching for the vtable gives an unpredictable result - but at least the vtable itself is cached.