Why does jmpq of x86-64 only need 32-bit length address?


When I use objdump -D to disassemble a binary, a typical jmpq is encoded like e9 7f fe ff ff, which represents a negative offset. However, addresses on x86-64 are 64 bits wide (48 bits in practice, to my knowledge), so how can the 32-bit value 7f fe ff ff represent a negative offset from a 64-bit absolute address?

Additionally, are there any other instructions like jmp and jmpq, but with a 64-bit displacement? How can I find these instructions in Intel's or AMD's manual (I searched for jmpq but found nothing)?


As I searched, it seems to be called RIP-relative addressing. And it seems that not all instructions do this. Is there 64-bit relative addressing? If it is an indirect jump, the 64-bit absolute address would be in a register or memory, right?

There are 3 answers

Ira Baxter On BEST ANSWER

As others have noted, the "jmp relative" instruction for x86-64 is limited to a 32-bit signed displacement, used as a relative offset with respect to the program counter.

OP asked why there is no relative jump with a 64-bit offset. I can't speak for the designers at Intel, but it seems pretty clear that this instruction would simply not be very useful, especially with the availability of the 32-bit relative jmp. The only time it would be needed is when your program is 2+ gigabytes in size, so that the 32-bit relative jmp cannot reach all of it from any point within it. Seen any 2 GB object files recently? So the apparent utility of such instructions seems really small.

Mostly, when programs get really large, they start to be broken into more manageable elements that can evolve at different rates (DLLs are an example of this). Interfacing between such elements is done by more arcane means (jump vectors, etc.) to ensure that the interfaces stay constant in the face of evolution. An extremely long jmp relative could be used to reach from an application to an entry point in another module, but the actual cost of loading an absolute address into a register and doing a register-indirect call is small enough in practice that it isn't worth optimizing. And modern CPU design is all about optimizing where you put your transistors to maximize performance.
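To give a feel for how cheap the register-indirect route is in code size, here is a rough Python sketch that builds the bytes for `mov rax, imm64` followed by `jmp rax`. The function name and the target address are made up for illustration; only the opcode bytes (REX.W + B8 for the move, FF /4 for the jump) come from the instruction set.

```python
import struct

def far_jump_via_rax(target: int) -> bytes:
    """Bytes for `mov rax, imm64` followed by `jmp rax` in 64-bit mode."""
    # REX.W prefix (48) + opcode B8 (+0 for rax) + 8-byte little-endian immediate
    mov_rax_imm64 = b"\x48\xb8" + struct.pack("<Q", target)
    # FF /4 with ModRM = 11 100 000: register-direct operand, rax
    jmp_rax = b"\xff\xe0"
    return mov_rax_imm64 + jmp_rax

code = far_jump_via_rax(0x00007F0012345678)  # hypothetical target address
print(code.hex(" "))
print(len(code), "bytes, versus 5 bytes for a rel32 jmp")
```

The pair costs 12 bytes and clobbers a register, which is the price paid only in the rare case where rel32 cannot reach.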

Just to be complete, the x86 (many flavors) have very short jmp relative instructions (8-bit signed offset), too. In practice, even the 32-bit jmp relative instructions are rarely needed, especially if you have a good code generator that can rearrange code blocks. Intel arguably could have left these out for the same reason; I suspect their utility is marginally high enough to justify the transistors.

The question of "big literal operands" shows up in funny ways in many architectures. If you examine the distribution of literal values in code, you'll discover that small values (0, 1, ASCII character codes) cover a pretty good percentage; almost everything else is a memory address. So you kind of don't need "big literal values" in programs, but you do have to handle memory addresses somehow. The SPARC chip famously has "load literal value low into register" (meaning "small constants") and a less often used "load literal value high" (to fill the upper bits of a register), which is paired with it as a second instruction to make big constants. This keeps the code small, except when you need a big constant; small code means higher effective instruction fetch rates, and that contributes to performance.
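The high/low split is easy to see with a bit of arithmetic. On SPARC the usual pair is `sethi` with the `%hi(x)` operator (upper 22 bits) and an `or` with `%lo(x)` (lower 10 bits). The Python below is only a sketch of that arithmetic, with hypothetical function names, not an assembler.

```python
def split_hi_lo(value: int):
    """Split a 32-bit constant the way SPARC's %hi and %lo operators do."""
    value &= 0xFFFFFFFF
    hi22 = value >> 10      # upper 22 bits: the sethi immediate
    lo10 = value & 0x3FF    # lower 10 bits: small enough for the or immediate
    return hi22, lo10

def rebuild(hi22: int, lo10: int) -> int:
    """Value left in the register after sethi followed by or."""
    return (hi22 << 10) | lo10

hi, lo = split_hi_lo(0xDEADBEEF)
print(hex(hi), hex(lo), hex(rebuild(hi, lo)))
```

A constant that fits in the low part needs only one instruction, which is the common case the paragraph above describes.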

Craig S. Anderson On

The E9 opcode in 64-bit mode takes a 32-bit signed displacement, which is sign-extended to 64 bits:

E9 cd -> JMP rel32 -> Jump near, relative, RIP = RIP + 32-bit displacement sign-extended to 64 bits
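Applied to the bytes from the question, the rule works out as in the Python sketch below. The address of the jmp is a made-up value, since the question doesn't give one, and the function name is hypothetical. The displacement is added to the address of the instruction after the jmp.

```python
def decode_jmp_rel32(code: bytes, address: int) -> int:
    """Return the target of an E9 rel32 jump located at `address`."""
    if len(code) != 5 or code[0] != 0xE9:
        raise ValueError("expected a 5-byte E9 rel32 instruction")
    # Little-endian, signed: this is the sign extension to 64 bits
    disp = int.from_bytes(code[1:5], "little", signed=True)
    next_rip = address + len(code)  # RIP already points past the jmp
    return (next_rip + disp) & 0xFFFFFFFFFFFFFFFF

insn = bytes.fromhex("e9 7f fe ff ff")  # bytes from the question
addr = 0x400600                         # hypothetical address of the jmp
print(hex(decode_jmp_rel32(insn, addr)))
```

Here 7f fe ff ff decodes to -385, so the jump lands 385 bytes before the following instruction, wherever in the 64-bit address space that happens to be.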

The FF opcode can be used to jump to a 64-bit address:

FF /4 -> JMP r/m64 -> Jump near, absolute indirect, RIP = 64-Bit offset from register or memory

Quotes taken from the Intel instruction set manual entry for JMP.

Z boson On

The following applies to 64-bit mode.

JMP can be done either directly or indirectly.

Direct jumps are relative to the instruction pointer RIP. There are two types of direct jumps: short and near.

  • Short jumps use opcode EB followed by an 8-bit signed displacement, so they reach from RIP -128 to RIP +127 bytes, where RIP is the address of the next instruction.
  • Near jumps use opcode E9 followed by a 32-bit signed displacement, so they reach from RIP -2147483648 to RIP +2147483647 bytes.
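Both ranges follow directly from the width of a two's-complement displacement. A small Python sketch (helper names are hypothetical) that reproduces them:

```python
def rel_range(bits: int):
    """Inclusive (min, max) of a signed displacement that is `bits` wide."""
    return -(1 << (bits - 1)), (1 << (bits - 1)) - 1

def fits(disp: int, bits: int) -> bool:
    """True if `disp` can be encoded as a signed value of that width."""
    low, high = rel_range(bits)
    return low <= disp <= high

print(rel_range(8))    # short jump, EB
print(rel_range(32))   # near jump, E9
print(fits(-129, 8), fits(-129, 32))
```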

Your assembler will use short jumps when it can, since they only need two bytes. But in NASM you can force a near jump using the near keyword, e.g.:

test:
    jmp test         ; eb fb 
    jmp near test    ; e9 f6 ff ff ff
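The displacement bytes depend on where each jmp sits relative to the label. The values in the listing are consistent with the first jmp starting 3 bytes after test; that layout is an assumption here, since the listing doesn't show what else is assembled in between. Under that assumption, a Python sketch of the encoding (function name hypothetical) gives the same bytes:

```python
import struct

def encode_jmp(target: int, here: int, force_near: bool = False) -> bytes:
    """Encode a direct jmp at offset `here` to `target`, short form if it fits."""
    disp8 = target - (here + 2)    # measured from the end of the 2-byte form
    if not force_near and -128 <= disp8 <= 127:
        return b"\xeb" + struct.pack("<b", disp8)
    disp32 = target - (here + 5)   # measured from the end of the 5-byte form
    return b"\xe9" + struct.pack("<i", disp32)

# Assumed layout: label `test` at offset 0, first jmp at offset 3, second at offset 5
print(encode_jmp(0, 3).hex(" "))
print(encode_jmp(0, 5, force_near=True).hex(" "))
```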

64-bit addressing modes are: RIP-relative, 32-bit absolute, 64-bit absolute, and relative to a base pointer. The JMP instruction can use all of these except 64-bit absolute. Indirect jumps use Opcode FF. Some examples using the NASM syntax:

jmp [a]                ;ff 24 25 00 00 00 00 - 32-bit absolute 
jmp [rel a]            ;ff 25 e7 ff ff ff    - RIP + 32-bit displacement
jmp [rdi]              ;ff 27                - base pointer
jmp [rdi +4*rsi + a]   ;ff a4 b7 00 00 00 00 - base pointer +4*index + displacement
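To see why those bytes select those addressing modes, it helps to split the ModRM and SIB bytes into their bit fields. The Python below is a sketch with hypothetical helper names; the field layout (2 + 3 + 3 bits) is the standard x86 one.

```python
def split_modrm(byte: int):
    """Return the (mod, reg, rm) fields of a ModRM byte."""
    return byte >> 6, (byte >> 3) & 7, byte & 7

def split_sib(byte: int):
    """Return (scale factor, index, base) from a SIB byte."""
    return 1 << (byte >> 6), (byte >> 3) & 7, byte & 7

# ModRM bytes from the four examples above
for modrm in (0x24, 0x25, 0x27, 0xA4):
    print(hex(modrm), split_modrm(modrm))

# SIB bytes from the first and last examples
print(split_sib(0x25), split_sib(0xB7))
```

In every case reg is 4, which is the /4 opcode extension that makes FF a JMP. With mod = 0, rm = 5 means RIP + disp32 in 64-bit mode, while rm = 4 means a SIB byte follows. The SIB byte 25 has index 4 (no index) and base 5 (no base, disp32 only), which is how the 32-bit absolute form is expressed. The SIB byte b7 has scale 4, index 6 (rsi) and base 7 (rdi).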

On OS X, however, 32-bit absolute addressing is not possible because the image base is at or above 2^32.

The only instruction that can use 64-bit absolute addressing is mov, and then either the source or the destination must be AL, AX, EAX or RAX. E.g., in NASM:

mov rax, [qword a]
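For reference, that form encodes as a REX.W prefix, opcode A1, and the full 8-byte address, 10 bytes in total. A Python sketch of the bytes, with a made-up address standing in for `a` and a hypothetical function name:

```python
import struct

def mov_rax_moffs64(address: int) -> bytes:
    """Bytes for `mov rax, [qword address]`: REX.W, opcode A1, 8-byte address."""
    return b"\x48\xa1" + struct.pack("<Q", address)

code = mov_rax_moffs64(0x1122334455667788)  # hypothetical address of `a`
print(code.hex(" "))
print(len(code), "bytes")
```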