So both GCC and Clang are smart enough to optimize printf("%s\n", "foo") to puts("foo") (GCC, Clang). That's good and all.
But when I run this function through Compiler Explorer:
#include <stdio.h>

void foo(void) {
    printf("%s", "foo");
}
Neither GCC nor Clang optimizes printf("%s", "foo") to fputs("foo", stdout), which I believe should behave identically to the printf (since fputs doesn't append a newline the way puts does) and be faster.
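To make the equivalence claim concrete, here is a minimal sketch that writes the same string once with a "%s" format and once with fputs, then compares the bytes that came out. It uses fprintf on a temporary file instead of printf on stdout so the output can be read back; the helper names capture and same_output are made up for this illustration. It only checks the bytes written; the return values still differ (printf reports the character count, fputs only a non-negative value), which matters only if the caller uses the result.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: write s to a temporary file using either
   fprintf(f, "%s", s) or fputs(s, f), then read the bytes back into buf.
   Returns the number of bytes read, or -1 if the file could not be created. */
static long capture(const char *s, int use_fputs, char *buf, size_t cap)
{
    assert(cap > 0);

    FILE *f = tmpfile();
    if (f == NULL)
        return -1;

    if (use_fputs)
        fputs(s, f);
    else
        fprintf(f, "%s", s);

    rewind(f);
    size_t n = fread(buf, 1, cap - 1, f);
    buf[n] = '\0';
    fclose(f);
    return (long)n;
}

/* Hypothetical helper: returns 1 if both calls wrote identical bytes. */
int same_output(const char *s)
{
    char a[256], b[256];
    long na = capture(s, 0, a, sizeof a);
    long nb = capture(s, 1, b, sizeof b);
    return na >= 0 && na == nb && memcmp(a, b, (size_t)na) == 0;
}
```

Calling same_output("foo") should report a match, which is consistent with the two calls being interchangeable when the return value is ignored.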
x86-64 GCC 11.1 (link):
.LC0:
        .string "foo"
.LC1:
        .string "%s"
foo:
        mov     esi, OFFSET FLAT:.LC0
        mov     edi, OFFSET FLAT:.LC1
        xor     eax, eax
        jmp     printf
x86-64 Clang 12.0.0 (link):
foo:                                    # @foo
        mov     edi, offset .L.str
        mov     esi, offset .L.str.1
        xor     eax, eax
        jmp     printf                  # TAILCALL
.L.str:
        .asciz  "%s"
.L.str.1:
        .asciz  "foo"
Is there any specific reason the call is not optimized to fputs, or is the compiler just not smart enough?
Some very specific situations are optimized, like the one you showed, but the matching is very superficial: if you add anything to your format string, even a space, the compiler immediately drops the puts and goes back to printf.
I guess nothing would stop a broader optimization. My speculation is that, since the performance gains are not that great, adding more special cases was deemed not worth it.
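The cases discussed so far can be put side by side in one file and pasted into Compiler Explorer with optimization enabled. This is only a sketch for experimenting; the function names are invented here, and the comments repeat what the question and this answer report rather than a guarantee for every compiler version.

```c
#include <stdio.h>

/* Reported in the question: rewritten to puts("foo"). */
void with_newline(void)
{
    printf("%s\n", "foo");
}

/* Per this answer: one extra space in the format string
   is enough for the call to stay a printf. */
void with_space(void)
{
    printf("%s \n", "foo");
}

/* The case from the question: stays a printf in the listings shown,
   instead of becoming fputs("foo", stdout). */
void no_newline(void)
{
    printf("%s", "foo");
}
```

Comparing the generated assembly for the three functions shows which call target the compiler picked in each case.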
In my view, the missing fputs optimization falls into that not-worth-it category.
This old GCC printf optimization document sheds some light on these optimizations; I doubt it would be much different today.
Specifically: