Operator overloading: Modify temporary object or create new one

204 views Asked by At

I've seen the following code in our project and asked myself what are the technical and mental implications:

class A {
public:
    A(const A&);
    A(A &&);
    ~A();

    A &operator += (const A &);
    A operator + (const A &);
private:
    class B;
    B *b;
};

A factory();
void sink(const A &);

void foo(const A &x) {
    sink(factory() += x); // <--
}

In the highlighted line I expected sink(factory() + x). I remarked that in the review process and got the reply that the current implementation is more efficient since no second temporary object is created. As a self respecting nerd I immediately fact checked this statement here and got the following x86_64 assembly output.

foo(A const&):
   push    rbx
    mov     rbx, rdi
    sub     rsp, 16
    lea     rdi, [rsp+8]
    call    factory()
    mov     rsi, rbx
    lea     rdi, [rsp+8]
    call    A::operator+=(A const&)
    mov     rdi, rax
    call    sink(A const&)
    lea     rdi, [rsp+8]
    call    A::~A()
    add     rsp, 16
    pop     rbx
    ret

Versus

bar(A const&):
    push    rbx
    mov     rbx, rdi
    sub     rsp, 16
    mov     rdi, rsp
    call    factory()
    mov     rdx, rbx
    mov     rsi, rsp
    lea     rdi, [rsp+8]
    call    A::operator+(A const&)
    lea     rdi, [rsp+8]
    call    sink(A const&)
    lea     rdi, [rsp+8]
    call    A::~A() [complete object destructor]
    mov     rdi, rsp
    call    A::~A() [complete object destructor]
    add     rsp, 16
    pop     rbx
    ret

You obviously save one temporary object! But intuitively I still would prefer the plus operator and would've hoped that the compiler is able to optimize the temporary objects away by constructing those objects in place such that no second temporary object must be destructed.

What is your take on this "optimization"? Are there other pro and counter arguments? Is there a possible way to save the plus operator approach? I always thought that move semantics will handle such things, it will prevent all those temporary objects, saves us from the proxy objects from the past, lets us write more direct and functional code and much much more...

2

There are 2 answers

1
Alan Birtles On

The compiler doesn't know how A is implemented so it can't know if any of the constructors, destructors or operators have side effects so it can't prove that optimising them away fulfils the "as if" rules, it therefore has to leave your code exactly as written.

2
fabian On

You could introduce the following 4 overloads at namespace scope (or add & and && modifiers for the member operators) and make use of the fact that the expression factory() results in an overload taking A&& to be preferred to one taking A const&. This at least in the case where one of the operands is a rvalue reference allows you to return this reference. Note that this could result in a nasty surprise, if you try to extend the lifetime of the result of the + operation by binding it to a reference.

Note that the operators for simplicity don't contain any meaningful implementation in the following code, but just demonstrate the possibility of reusing an object passed to the operand, if it's an rvalue.

struct A
{
    int m_dummy{ 1 };

    friend constexpr A operator+(A const& s1, A const& s2)
    {
        return {};
    }

    friend constexpr A&& operator+(A&& s1, A const& s2)
    {
        return std::move(s1);
    }

    friend constexpr A&& operator+(A&& s1, A&& s2)
    {
        return std::move(s1);
    }

    friend constexpr A&& operator+(A const& s1, A&& s2)
    {
        return std::move(s2);
    }
};

A const* address(A const& a)
{
    return &a;
}

int main() {
    std::cout << std::boolalpha;

    A s1;
    A s2;

    // none of the operands is an rvalue -> create a new temporary
    std::cout << (address(s1 + s2) == &s1) << '\n'; // false
    std::cout << (address(s1 + s2) == &s2) << '\n'; // false

    // same scenario as sink(factory() + x); here...
    // the first operand is an rvalue
    std::cout << (address(std::move(s1) + s2) == &s1) << '\n'; // true

    // second operand is an rvalue
    std::cout << (address(s1 + std::move(s2)) == &s2) << '\n'; // true

    // both operands are rvalues
    std::cout << (address(std::move(s1) + std::move(s2)) == &s1) << '\n'; // true
}

Usually you wouldn't expect the following code to result in undefined behaviour though, which is why I wouldn't overloading the operator like this. (You could potentially avoid this issue by implementing move semantics for A and returning by value; this would however result in the move construction of a new instance of A.)

int main() {
    A const& value = A{} + A{};
    std::cout << value.m_dummy << '\n'; // reading from a dangling reference here -> UB
}