Can this Linux / 32bit x86 assembly "Hello, World" be made smaller still?

2.4k views Asked by At

The following 32bit x86 Linux program prints a string of arbitrary length (as long as a program can be, anyway) and does exit(0) afterwards:

.global _start             ; notice on entry here, all regs but %esp are zero
_start:
    call  .L0              ; offset == strlen, provided by your assembler
.byte 'H','e','l','l','o',',',' ','W','o','r','l','d'
.L0:
    pop   %ecx             ; ret addr is starting addr of string
    mov   -4(%ecx),%edx    ; argument to `call`, 4 bytes: strlen
    inc   %ebx             ; stdout == 1
    movb  $4, %al          ; SYS_write == 4
    int   $0x80
    xchg  %eax,%ebp        ; %ebp is still zero
    xchg  %eax,%ebx        ; SYS_exit == 1, return value == 0
    int   $0x80

If one's willing to sacrifice the position-independence (instead, force the linker to insert the string address), and not care about the program returning zero, one can get it down to:

.global _start
_start:
    movb  $4, %al
    inc   %ebx
    mov   $.L0, %ecx       ; this address is calculated when linking
    movb  $.Lend-.L0, %dl  ; strlen, calculated by assembler
    int   $0x80
    xchg  %eax,%ebx
    int   %0x80
.L0:
.byte 'H','e','l','l','o',',',' ','W','o','r','l','d'
.Lend:

Both of these can be assembled/linked via as --32 -o x.o x.S; ld -s -m elf_i386 x.o, and run just fine. The second one is 26 Bytes of code. If you permit a crash after printing Hello, World then leave the last two instructions out, 23 Bytes. That's as low as I could go.

Question that's always bugged me, is it possible to squeeze a few more bytes off this ? Pure speculation of mine gives these possible leads:

  • Somehow use parts of the 'Hello, World' itself as code ?
  • Anyone knows a usable syscall easter egg ?
  • trick the linker into making the entrypoint a 16-bit address so that movw $.L0, %cx could be used (saves one byte) ?
  • Do an 8-bit offset jmp to a place that's known (or created via assembler / linker invocation magic) to contain the necessary instructions for the exit(...) syscall, saving one byte over the xchg; int sequence ?

Or else, can it be proven that this actually is the smallest well-behaved (no crash / return code zero) Linux/x86 "Hello, World" ?

Edit

To clarify, the question is not about minimizing the size of the ELF executable; techniques for that are long-known. I'm explicitly inquiring about the size of a Linux 32bit x86 assembly program that performs the equivalent of what the compiled code for:

int main(int argc, char **argv)
{
    puts("Hello, World");
    exit(0); /* or whatever code */
}

would do.
In fact, I'll be happy about anything that does not require manual editing of the ELF headers. If you find a way to e.g. stuff the "Hello, World" into some ELF object and referencing that from the assembly source, using only the assembler / linker command line and/or a mapfile input, I'd consider it valid enough, even though that increases the size of the ELF executable. I just want to know if the instruction sequence to print "Hello, World" and exit() afterwards can be shrunk still.
The question is about code size, not executable size.

2

There are 2 answers

2
Igor Skochinsky On

It's been done back in 1999. Have a look at this page (spoiler: the end result is a 45-byte ELF file). Make sure to read the postscript too.

2
ephemient On

A straightforward translation of the C code, using libc, results in 16 bytes of instructions:

.S:
    .asciz "Hello, World"
.globl main
main:
    push $.S
    call puts
    add $4, %esp
    xor %eax, %eax
    ret

If you use x86-64 instead if x86-32, the calling convention passes arguments in registers so we can skip the stack manipulation, and

main:
    mov $.S, %rdi
    call puts
    xor %eax, %eax
    ret

is only 15 bytes of code.