MinGW's ld cannot perform PE operations on non PE output file


I know there are similar questions about this, on StackOverflow and elsewhere. I've researched this a lot and still haven't found a single solution. I'm writing an operating system as a side project. So far I've done everything in assembly, but now I want to add C code. To test, I made this assembly file (called test.asm):

[BITS 32]

GLOBAL _a

SECTION .text

_a:
    jmp $

Then I made this C file (called main.c):

extern void a(void);
int main(void)
{
    a();
}

To link, I used this file (called make.bat):

"C:\minGW\bin\gcc.exe"  -ffreestanding -c -o c.o main.c
nasm -f coff -o asm.o test.asm
"C:\minGW\bin\ld.exe" -Ttext 0x100000 --oformat binary -o out.bin c.o asm.o

pause

I've been researching this for ages, and I'm still struggling to find an answer. I hope this won't be flagged as a duplicate. I am aware that similar questions exist, but they all have different answers, and none of them work for me.

Question: What am I doing wrong?

Answer by Martin Rosenau (best answer):

Old MinGW versions had the problem that "ld" was not able to create non-PE files at all.

Maybe current versions have the same problem.

The workaround was to create a PE file with "ld" and then convert the PE file to binary, HEX or S19 using "objcopy".

--- EDIT ---

Thinking about the question again I see two problems:

As I already said, some versions of "ld" have problems creating "binary" output (as opposed to "PE", "ELF" or whatever format is used).

Instead of:

ld.exe --oformat binary -o file.bin c.o asm.o

You should use the following sequence to create the binary file:

ld.exe -o file.tmp c.o asm.o
objcopy -O binary file.tmp file.bin

The first command creates an ".exe" file named "file.tmp"; "objcopy" then extracts the raw data from that ".exe" file.

The second problem is the linking itself:

"ld" assumes a ".exe"-like file format - even if the output file is a binary file. This means that ...

  • ... you cannot even be sure whether the object code of "main.o" is really placed at the first address of the resulting object code. "ld" is also allowed to put the code of "a()" before "main()", or even to put "internal" code before both "a()" and "main()".
  • ... addressing works a bit differently, which means that a lot of padding bytes will be created (maybe at the start of the file!) if you do something wrong.

The only option I see is to write a "linker script" (sometimes called a "linker command file") and to create a special section in the assembly code. I normally use an assembler other than "nasm", so I am not sure the syntax here is correct:

[BITS 32]
GLOBAL _a
EXTERN _main        ; defined in the C file; NASM needs this declaration
SECTION .entry
    jmp _main
SECTION .text
_a:
    jmp $

In the linker script you can specify which sections appear in which order. Specify that ".entry" is the first section of the file so you can be sure it is the first instruction of the file.

In the linker script you may also say that multiple sections (e.g. ".entry", ".text" and ".data") should be combined into a single section. This is useful because sections are normally 0x1000-byte-aligned in PE files! If you do not combine multiple sections into one you'll get a lot of stub bytes between the sections!

Unfortunately I'm no expert on linker scripts, so I cannot help you much with that.
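
To put a number on the padding cost of unmerged sections mentioned above, here is a small C sketch. The function names align_up and padding_after are made up for this illustration, and 0x1000 is simply the alignment figure quoted above; the real section alignment depends on the linker and its options.

```c
#include <assert.h>
#include <stdint.h>

/* Round `value` up to the next multiple of `alignment`.
   `alignment` must be a power of two (0x1000 in the example above). */
uint32_t align_up(uint32_t value, uint32_t alignment)
{
    assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
    return (value + alignment - 1) & ~(alignment - 1);
}

/* Number of filler bytes that follow a section of `size` bytes
   before the next section can start on an `alignment` boundary. */
uint32_t padding_after(uint32_t size, uint32_t alignment)
{
    return align_up(size, alignment) - size;
}
```

Under these assumptions, a 0x10-byte ".entry" section followed by a separate ".text" section would be followed by padding_after(0x10, 0x1000) = 0xFF0 filler bytes in the raw binary, which is why merging the sections into one is attractive.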

Using "-Ttext" is also problematic:

In PE files the actual address of a section is calculated as "image base" + "relative address". The "-Ttext" argument influences the "relative address" only. Because the "relative address" of the first section is typically fixed to 0x1000 on Windows, "-Ttext 0x2000" would do nothing but insert 0x1000 stub bytes at the start of the first section. You do not change the start address of ".text" at all; you only pad the beginning of the ".text" section so that the first useful byte is located at 0x2000. (Some "ld" versions may behave differently.)

If you want the first section of your file to be located at address 0x100000, you should use the equivalent of "-Ttext 0x1000" in the linker script (-Ttext is not used if a linker script is used) and set the "image base" to 0xFF000:

ld.exe -T linkerScript.ld --image-base 0xFF000 -o binary.tmp a.o main.o

The memory address of the ".text" section will be 0xFF000 + 0x1000 = 0x100000.

(And the first byte of the binary file generated by "objcopy" will be the first byte of the first section - representing memory address 0x100000.)
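
The address arithmetic above can be checked with a few lines of C. This is only a sketch of the "image base" + "relative address" rule as described in this answer; the function names section_address and image_base_for are hypothetical and do not come from any toolchain.

```c
#include <assert.h>
#include <stdint.h>

/* Memory address of a PE section: image base plus the section's
   relative address, following the rule described above. */
uint32_t section_address(uint32_t image_base, uint32_t rva)
{
    assert(rva <= UINT32_MAX - image_base); /* guard against 32-bit wrap-around */
    return image_base + rva;
}

/* Image base needed so that a section with relative address `rva`
   ends up at the memory address `target`. */
uint32_t image_base_for(uint32_t target, uint32_t rva)
{
    assert(target >= rva);
    return target - rva;
}
```

For the values used here, image_base_for(0x100000, 0x1000) yields 0xFF000, and section_address(0xFF000, 0x1000) gives back 0x100000.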