How can a label in assembly know its runtime address?


I am learning assembly (z80 and x86) and am now coming to grips with building binaries using an assembler.

How is it possible to use labels with absolute (as opposed to relative) addresses?

From what I understand, an assembler translates a label to a memory address at assembly time, but how can the assembler know which address the label will occupy at runtime?

It seems simple enough for a z80 bare metal system, as you can load a program into a specific memory address and assert the RESET signal, which sets the program counter to 0000h. What happens when there is an operating system running?

Wouldn't code running under an operating system be unaware of its starting address (and therefore have no way of using absolute-address instructions such as the z80's call and jp with labels)?


There are 4 answers

0
Alexey Frunze (best answer)

If machine code is not position-independent, there are two common strategies:

  • include relocation information in the executable file, which tells the OS loader where the absolute addresses that need adjusting are located

  • load the executable at the address it was built for, which means either evicting whatever already occupies that spot or giving every program its own address space, so that there is no contention for the spot in the first place
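To make the first strategy concrete, here is a minimal sketch of what a loader does with relocation information. It is a toy model, not any real executable format: the `relocate` function, the `relocs` list and the five-byte image are all made up for illustration. The only real details are the z80 encodings (`CD lo hi` for `CALL nn`, `C9` for `RET`).

```python
import struct

def relocate(image, relocs, link_base, load_base):
    """Return a copy of image with every listed 16-bit absolute address adjusted.

    relocs holds byte offsets into the image where a little-endian
    absolute address is stored (hypothetical format, for illustration).
    """
    out = bytearray(image)
    delta = load_base - link_base
    for off in relocs:
        (addr,) = struct.unpack_from("<H", out, off)
        struct.pack_into("<H", out, off, (addr + delta) & 0xFFFF)
    return bytes(out)

# Program assembled as if it would live at 0000h:
#   0000: CD 04 00   CALL 0004h
#   0003: C9         RET
#   0004: C9         RET        ; the subroutine being called
image = bytes([0xCD, 0x04, 0x00, 0xC9, 0xC9])
relocs = [1]  # the operand of CALL starts at byte offset 1

loaded = relocate(image, relocs, link_base=0x0000, load_base=0x8000)
print(loaded.hex(" "))  # cd 04 80 c9 c9
```

After loading at 8000h the call targets 8004h, which is where the subroutine actually ended up. Only the bytes named in the relocation list are touched; opcodes are left alone.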

0
Tommy

It might be worth reading up on CP/M's solution, which was simply: the binary is always loaded at a fixed address, the OS entry point is always at another fixed address. This was fairly typical on 8-bit machines, even those with formal OSes, and was carried through into MS-DOS (a .COM file always starts at offset 100h within its segment). It's also technically feasible with multitasking OSes that utilise an MMU, as each process gets its own address space, so each binary can think it was loaded at the same place.
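With a fixed load address the assembler's job is plain arithmetic: start counting at the origin and add up instruction sizes. A rough sketch of that first pass follows, assuming CP/M's load address of 0100h. The `program` list and `resolve_labels` are hypothetical stand-ins for a real assembler's internals.

```python
ORIGIN = 0x0100  # CP/M loads programs at 0100h

# Each entry: (label or None, instruction size in bytes).
# The sizes match the z80 instructions named in the comments.
program = [
    ("start", 3),  # JP done  (1 opcode byte + 2 address bytes)
    (None,    1),  # NOP
    ("done",  1),  # RET
]

def resolve_labels(program, origin):
    """First pass of a two-pass assembler: assign an address to each label."""
    symbols = {}
    addr = origin
    for label, size in program:
        if label is not None:
            symbols[label] = addr
        addr += size
    return symbols

symbols = resolve_labels(program, ORIGIN)
print({name: hex(a) for name, a in symbols.items()})
# {'start': '0x100', 'done': '0x104'}
```

The second pass would then write 0104h into the operand of `JP done`. That address is only valid because the OS promises to load the binary at 0100h every time.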

The generation in between uses relocatable code. Either the code is position-independent because the CPU makes that easy (as with Classic Mac OS and the 68000's PC-relative addressing) or, in effect, the second pass of a classic two-pass assembler happens when the binary is loaded: the binary contains placeholders for all absolute addresses plus a list of where those placeholders are, so that they can be filled in once the actual addresses are known.

The only issue with load-time fix-ups is that they prevent speedy virtual memory. With the non-MMU Mac OS approach, a program is compiled as 16 KB chunks, with jumps within each chunk occurring directly and jumps to other chunks going through a pager. If the target chunk is already loaded then off it goes; if not, it is loaded first and then the jump occurs. Such loading on demand is prohibitively slow if addresses need to be calculated and filled in on every load.
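The "jump through a pager" idea can be modelled roughly as below. This is a toy illustration of on-demand chunk loading in general, not the actual Mac OS mechanism; `JumpTable`, `far_call` and `fake_loader` are invented names.

```python
class JumpTable:
    """Routes inter-chunk calls, loading the target chunk on first use."""

    def __init__(self, loader):
        self._loader = loader   # callable: chunk name -> dict of routines
        self._resident = {}     # chunks that are already in memory
        self.loads = 0          # how many times the loader had to run

    def far_call(self, chunk, routine, *args):
        if chunk not in self._resident:
            self._resident[chunk] = self._loader(chunk)
            self.loads += 1
        return self._resident[chunk][routine](*args)

def fake_loader(chunk):
    # Stands in for reading the chunk from disk.
    return {"double": lambda x: 2 * x}

table = JumpTable(fake_loader)
print(table.far_call("chunk2", "double", 21), table.loads)  # 42 1
print(table.far_call("chunk2", "double", 4), table.loads)   # 8 1
```

The second call finds the chunk resident and skips the loader. If every load also required patching absolute addresses inside the chunk, that cost would be paid each time a chunk was brought back in.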

2
user3344003

For relative jumps, assemblers encode offsets rather than absolute addresses.

LABEL:
     . . . . .

     JMP LABEL ; encoded as the number of bytes back to LABEL, so the code can be loaded anywhere
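A quick sketch of the arithmetic, for what it's worth. It applies to two-byte short relative jumps such as x86 `JMP rel8` and z80 `JR e`, where the displacement is counted from the address of the next instruction. The helper name `rel8` and the five-byte gap are just for illustration. Note that this does not cover absolute instructions like z80 `JP nn` or `CALL nn`, which still need the real address.

```python
def rel8(jump_addr, target_addr, insn_len=2):
    """Displacement byte for a short relative jump.

    The CPU adds the displacement to the address of the instruction
    that follows the jump, hence the insn_len term.
    """
    disp = target_addr - (jump_addr + insn_len)
    if not -128 <= disp <= 127:
        raise ValueError("target out of range for a short jump")
    return disp & 0xFF  # two's-complement byte

# LABEL sits 5 bytes before the jump; try three different load addresses.
for base in (0x0000, 0x4000, 0xC000):
    print(hex(base), hex(rel8(base + 5, base + 0)))
# 0x0 0xf9
# 0x4000 0xf9
# 0xc000 0xf9
```

The encoded byte is F9h (-7) in every case, which is why the assembler never needs to know the load address for this kind of jump.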
0
Van Uitkon

I assume your program is smaller than 64 KB. In that case, the program only needs to know the OFFSET of the label (a near jump). The operating system starts the program at the same OFFSET every time, but in a different segment. Conditional jumps and "jmp short" use only the distance between the jump instruction and the label. In some special cases, for example if a procedure is copied to the stack before it is executed, the compiler inserts code that patches the operands of the jmp instructions.
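As a rough illustration of why a fixed offset is enough: in 8086 real mode the physical address is segment × 16 + offset, so the same offset lands at a different physical address depending on the segment the OS picked. The offset 0150h and the segment values below are arbitrary examples.

```python
def physical(segment, offset):
    """8086 real-mode address: segment * 16 + offset, wrapped to 20 bits."""
    return ((segment << 4) + offset) & 0xFFFFF

LABEL_OFFSET = 0x0150  # hypothetical label offset, fixed at assembly time

for seg in (0x1000, 0x2345):
    print(f"{seg:04X}:{LABEL_OFFSET:04X} -> {physical(seg, LABEL_OFFSET):05X}")
# 1000:0150 -> 10150
# 2345:0150 -> 235A0
```

The assembler only ever emits 0150h. The segment part comes from the CS register, which the OS sets when it starts the program.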