So both GCC and Clang are smart enough to optimize printf("%s\n", "foo") to puts("foo") (GCC, Clang). That's good and all.
But when I run this function through Compiler Explorer:
#include <stdio.h>
void foo(void) {
    printf("%s", "foo");
}
Neither GCC nor Clang optimizes printf("%s", "foo") to fputs("foo", stdout), which I believe should be identical to the printf (since fputs doesn't append a newline the way puts does) and faster.
x86-64 GCC 11.1 (link):
.LC0:
        .string "foo"
.LC1:
        .string "%s"
foo:
        mov     esi, OFFSET FLAT:.LC0
        mov     edi, OFFSET FLAT:.LC1
        xor     eax, eax
        jmp     printf
x86-64 Clang 12.0.0 (link):
foo:                                    # @foo
        mov     edi, offset .L.str
        mov     esi, offset .L.str.1
        xor     eax, eax
        jmp     printf                          # TAILCALL
.L.str:
        .asciz  "%s"
.L.str.1:
        .asciz  "foo"
Is there any specific reason for not optimizing to fputs, or is the compiler just not smart enough?
 
                        
Some very specific situations are optimized, like the one you showed, but it's very superficial: if you add something to your format string, even a space, it immediately discards the puts and goes back to printf.
I guess there would be nothing to stop a broader optimization. My speculation is that, since the performance gains are not that great, adding more special cases was deemed not worth it.
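To make the comparison concrete, here is a minimal sketch that checks whether the two calls from the question produce the same bytes. It makes a few assumptions: the helper names capture and same_output are made up for illustration, and it writes to a temporary file with fprintf(fp, "%s", s) as a stand-in for printf("%s", s), so that the output can be read back and compared.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper (not from the post): writes s to a temporary file,
   using fprintf with "%s" when use_printf is nonzero and fputs otherwise,
   then reads the bytes back into out (at most cap - 1 of them).
   Returns the number of bytes read back, or -1 if no temp file is available. */
static long capture(const char *s, int use_printf, char *out, size_t cap)
{
    assert(cap > 0);

    FILE *fp = tmpfile();
    if (fp == NULL)
        return -1;

    if (use_printf)
        fprintf(fp, "%s", s);   /* stand-in for printf("%s", s) */
    else
        fputs(s, fp);           /* no newline appended, unlike puts */

    rewind(fp);                 /* switch the stream from writing to reading */
    size_t n = fread(out, 1, cap - 1, fp);
    out[n] = '\0';
    fclose(fp);
    return (long)n;
}

/* Returns 1 if both calls wrote byte-identical output for s, 0 otherwise. */
int same_output(const char *s)
{
    char a[256], b[256];
    long na = capture(s, 1, a, sizeof a);
    long nb = capture(s, 0, b, sizeof b);
    return na >= 0 && na == nb && memcmp(a, b, (size_t)na) == 0;
}
```

Calling same_output("foo") should report that the output matches. One difference the sketch does not cover is the return value: printf returns the number of characters written, while fputs only promises a non-negative value on success, so the substitution is only safe when the result is discarded, as it is in foo() above.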
In my speculation, the lack of an fputs optimization would fall into that not-worth-it category.
This old gcc printf optimization document sheds some light on these optimizations. I doubt that it would be much different today.
Specifically: