Why does Return Value Optimization not happen if no destructor is defined?

I expected to see copy elision from Named Return Value Optimization (NRVO) in this test program, but its output is "Addresses do not match!", so NRVO didn't happen. Why is this?

// test.cpp
// Compile using:
//      g++ -Wall -std=c++17 -o test test.cpp
#include <string>
#include <iostream>

// Address of the local object in fn(), recorded for comparison in main().
void *addr = nullptr;

class A
{
public:
    int i;
    int j;

#if 0
    ~A() {}
#endif
};

A fn()
{
    A fn_a;

    addr = &fn_a;

    return fn_a;
}

int main()
{
    A a = fn();

    if (addr == &a)
        std::cout << "Addresses match!\n";
    else
        std::cout << "Addresses do not match!\n";
}

Notes:

  1. If a destructor is defined by enabling the #if above, then the NRVO does happen (and it also happens in some other cases such as defining a virtual method or adding a std::string member).

  2. No methods have been defined so A is a POD struct, or in more recent terminology a trivial class. I don't see an explicit exclusion for this in the above links.

  3. Adding compiler optimisation (to a more complicated example that doesn't just reduce to the empty program!) doesn't make any difference.

  4. Looking at the assembly for a second example shows that this happens even where I would expect mandatory Return Value Optimization (RVO), so the NRVO above was not prevented by taking the address of fn_a in fn(). Clang, GCC, ICC and MSVC on x86-64 all show the same behaviour, which suggests it is intentional and not a bug in a specific compiler.

     class A
     {
     public:
         int i;
         int j;
    
     #if 0
         ~A() {}
     #endif
     };
    
     A fn()
     {
         return A();
     }
    
     int main()
     {
         // Where RVO occurs, the call to fn() is preceded on x86-64 by a move
         // to RDI (the address of the result object); otherwise it is
         // followed by a move from RAX.
         A a = fn();
     }
    
There are 2 answers

Answer by eerorika (best answer)

The language rule that allows this when returning a prvalue (the second example) is:

[class.temporary]

When an object of class type X is passed to or returned from a function, if X has at least one eligible copy or move constructor ([special]), each such constructor is trivial, and the destructor of X is either trivial or deleted, implementations are permitted to create a temporary object to hold the function parameter or result object. The temporary object is constructed from the function argument or return value, respectively, and the function's parameter or return object is initialized as if by using the eligible trivial constructor to copy the temporary (even if that constructor is inaccessible or would not be selected by overload resolution to perform a copy or move of the object). [Note: This latitude is granted to allow objects of class type to be passed to or returned from functions in registers. — end note ]


Why does Return Value Optimization not happen [in some cases]?

The motivation is explained in the note at the end of the quoted rule: the latitude exists so that objects of class type can be returned in registers. Essentially, RVO is sometimes less efficient than no RVO.

If a destructor is defined by enabling the #if above, then the RVO does happen (and it also happens in some other cases such as defining a virtual method or adding a std::string member).

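In the second example, this follows from the rule: creating the temporary is only allowed when the destructor is trivial or deleted and every eligible copy or move constructor is trivial. A user-provided destructor, a virtual method or a std::string member each break one of those conditions, so the result object has to be constructed in place.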

In the NRVO case, I suppose this is up to the language implementation.

Answer by Nicol Bolas

On many ABIs, if a return value is a trivially copyable object whose size/alignment is equal to or less than that of a pointer/register, then the ABI will not permit elision. The reason is that it is more efficient to return the value in a register than through a stack memory address.

Note that when you take the address of either the object in the function or the returned object, the compiler will force that object onto the stack. But the actual passing of the object will still be via a register.