I'm quite new to assembly and low-level coding, and I found nothing useful on Google, so maybe someone could explain why a simple Hello World ARM64 program takes 17 KB on my Mac.
I'm using an Apple Mac Mini M1, and I started learning some assembly in an OS-dev tutorial. It used NASM x86 assembly, which was very simple: the code was assembled with plain NASM, without any cross-compiler setup. The tutorial started with a Hello World program that uses BIOS interrupts, and the output file was very small: basically just the assembly instructions encoded as bytes, as you would expect.
Now I tried coding in assembly on my Mac, and after finding some useful information about ARM64 and M1 assembly in the depths of the web, I wrote the Hello World program you see below. I assembled it with the as command, and the output file was very weird.
It contained some strings like __TEXT, ltmp0 and the names of the labels in the assembly source code. What are they used for? And why don't NASM-assembled programs need that? Wouldn't it be enough to just move the arguments into the registers and call the 0x80 interrupt?
But this isn't even the final output file. It's object code, so I had to link it, and the final program (which really printed "Hello World!" to my terminal) has a size of a whopping 17 kilobytes. All that just for setting a handful of registers (the ARM64 counterparts of ax, bx, cx and dx), calling an interrupt and storing the Hello World string. What is stuff like __mh_execute_header, __unwind_info and /usr/lib/libSystem.B.dylib used for? And why does about 15 KB of the file consist of 0x00 bytes?
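In case anyone wants to reproduce the zero-byte observation, here is a minimal Python sketch for counting them. The helper name count_zero_bytes is just something I made up for illustration, and the file path you pass in is whatever your linked executable happens to be called.

```python
from pathlib import Path


def count_zero_bytes(path):
    """Return (zero_count, total_size) for the file at `path`."""
    # Read the whole file into memory; fine for a small executable.
    data = Path(path).read_bytes()
    # Count how many bytes are exactly 0x00.
    return data.count(b"\x00"), len(data)
```

Calling count_zero_bytes on the linked binary returns the number of 0x00 bytes and the total file size, so dividing the first by the second gives the share of the file that is zero padding.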
I know that some of this depends on my OS and system libraries, but does macOS really need all 17 KB and those strings to print "Hello World!" to the terminal? And why does as also do that crazy stuff on x86 Linux, while nasm on macOS and Linux gives just simple code? Is this really necessary for my code to run, or is there any way to avoid it?
Thanks for answers!
The ARM assembly code:
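.global _main                 // export the entry point for the linker
.align 2                      // align code to 2^2 = 4 bytes
_main:
    mov X0, #1                // first argument: file descriptor 1 (stdout)
    adr X1, hello             // second argument: address of the string
    mov X2, #13               // third argument: length of the string in bytes
    mov X16, #4               // syscall number 4 (write)
    svc #0x80                 // supervisor call into the kernel
    mov X16, #1               // syscall number 1 (exit)
    mov X0, #0                // exit status 0
    svc #0x80                 // supervisor call into the kernel
hello: .ascii "Hello World!\n"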