Consider the following code:
#include <utility>
#include <string>
int bar() {
std::pair<int, std::string> p {
123, "Hey... no small-string optimization for me please!" };
return p.first;
}
(simplified thanks to @Jarod42 :-) ...)
I expect the function to be implemented as simply:
bar():
mov eax, 123
ret
but instead, the implementation calls operator new(), constructs an std::string with my literal, then calls operator delete(). At least - that's what gcc 9 and clang 9 do (GodBolt). Here's the clang output:
bar(): # @bar()
push rbx
sub rsp, 48
mov dword ptr [rsp + 8], 123
lea rax, [rsp + 32]
mov qword ptr [rsp + 16], rax
mov edi, 51
call operator new(unsigned long)
mov qword ptr [rsp + 16], rax
mov qword ptr [rsp + 32], 50
movups xmm0, xmmword ptr [rip + .L.str]
movups xmmword ptr [rax], xmm0
movups xmm0, xmmword ptr [rip + .L.str+16]
movups xmmword ptr [rax + 16], xmm0
movups xmm0, xmmword ptr [rip + .L.str+32]
movups xmmword ptr [rax + 32], xmm0
mov word ptr [rax + 48], 8549
mov qword ptr [rsp + 24], 50
mov byte ptr [rax + 50], 0
mov ebx, dword ptr [rsp + 8]
mov rdi, rax
call operator delete(void*)
mov eax, ebx
add rsp, 48
pop rbx
ret
.L.str:
.asciz "Hey... no small-string optimization for me please!"
My question is: Clearly, the compiler has full knowledge of everything going on inside bar(). Why is it not "eliding"/optimizing the string away? More specifically:
- At the basic level there's the code between then
new()anddelete(), which AFAICT the compiler knows results in nothing useful. - Secondarily, the
new()anddelete()calls themselves. After all, small-string-optimization is allowed by the standard AFAIK, so even though clang/gcc hasn't chosen to use that - it could have; meaning that it's not actually required to callnew()ordelete()there.
I'm particularly interested in what part of this is directly due to the language standard, and what part is compiler non-optimality.
Following the discussion in various answers and comments here, I have now filed the following bugs against GCC and LLVM regarding this issue:
GCC bug 94293: [missed optimization] new+delete of unused local string not removed
Minimal testcase (GodBolt):
GCC bug 94294: [missed optimization] Useless statements populating local string not removed
Minimal testcase (GodBolt):
LLVM bug 45287: [missed optimization] failure to drop unused libstdc++ std::string.
Minimal testcase (GodBolt):
Thanks goes to: @JeffGarret, @NicolBolas, @Jarod42, Marc Glisse .
Update, August 2021: With recent versions of clang++, g++ and libstc++, all of these minimal testcases eschew memory allocation as one would expect. clang++ also has this behavior for OP's program in the question, but GCC still allocates and deallocates.