How do you understand 'REX.W + B8+ rd io' form for x86-64 assembly?

Question

904 views Asked by Happy Jerry At 06 April 2022 at 22:29

I was originally trying to generate the bytes for an immediate move into a 64-bit register. The specific operation I wanted was

mov rdi, 0x1337

Using https://www.felixcloutier.com/x86/mov, the only non-sign-extended form I saw was

REX.W + B8+ rd io

This confused me so I created a small assembly program to see what the assembler would generate

          global    _start

          section   .text
_start:   
          mov       rdi, 0x1337 
          syscall                           
          mov       rax, 60                 
          xor       rdi, rdi                
          syscall

I had to turn off optimizations so that there would be a move into a 64-bit register. So I assembled and linked with nasm -felf64 -O0 main.asm && ld main.o, which generated an a.out. I then looked at the output of objdump -M intel -d ./a.out and found this line

48 bf 37 13 00 00 00    movabs rdi,0x1337

That line looks nothing like

REX.W + B8+ rd io

to me? Additionally, after some research, I saw that the instruction is supposed to be 10 bytes. How do you get that from REX.W + B8+ rd io?

**harold** · Accepted Answer · 2022-04-06T23:07:07+00:00

B8+ rd means the operand (a register) is encoded in the low 3 bits of the opcode, not in a ModR/M byte.

From the Intel Software Developer's Manual,

+rb, +rw, +rd, +ro — Indicated the lower 3 bits of the opcode byte is used to encode the register operand without a modR/M byte. The instruction lists the corresponding hexadecimal value of the opcode byte with low 3 bits as 000b. In non-64-bit mode, a register code, from 0 through 7, is added to the hexadecimal value of the opcode byte. In 64-bit mode, indicates the four bit field of REX.b and opcode[2:0] field encodes the register operand of the instruction. “+ro” is applicable only in 64-bit mode.

It looks like Intel intended to use +ro for 64-bit operands encoded in that way, but then never actually did so: not just in the mov entry, but anywhere, as far as I could find. For example, 64-bit push and pop could have been listed with + ro, but they also use + rd. Also, "Indicated" is likely a typo, since the rest of the text uses the present tense.

The (e/r)di register is number 7, and B8 + 7 = BF, explaining the opcode.
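The arithmetic above can be sketched in Python. This is only an illustration: the `REG_NUMBERS` table and the `opcode_plus_reg` helper are hypothetical names, not part of any assembler. The low 3 bits of the register number are added to the base opcode, and the fourth bit is what would go into REX.B.

```python
# Standard x86-64 register numbers for the 64-bit general-purpose registers.
# REG_NUMBERS and opcode_plus_reg are illustrative names, not a real API.
REG_NUMBERS = {
    "rax": 0, "rcx": 1, "rdx": 2, "rbx": 3,
    "rsp": 4, "rbp": 5, "rsi": 6, "rdi": 7,
    "r8": 8, "r9": 9, "r10": 10, "r11": 11,
    "r12": 12, "r13": 13, "r14": 14, "r15": 15,
}

def opcode_plus_reg(base_opcode, reg):
    """Return (opcode_byte, rex_b) for a '+rd'-style encoding."""
    num = REG_NUMBERS[reg]
    low3 = num & 0b111   # goes into opcode[2:0]
    rex_b = num >> 3     # 1 only for r8..r15, stored in REX.B
    return base_opcode + low3, rex_b

opcode, rex_b = opcode_plus_reg(0xB8, "rdi")
print(hex(opcode), rex_b)  # 0xbf 0
```

For r8 through r15 the opcode byte wraps back to B8..BF and the distinction is carried by REX.B instead.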

io stands for a qword immediate (o for octo, as in 8 bytes, perhaps?).

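The REX prefix (0x40 for the base prefix, +8 to set the W bit, optionally +1 to set the B bit to access R8..R15), the opcode, no ModR/M byte, and the 8-byte immediate add up to 1 + 1 + 0 + 8 = 10 bytes. Here 0x40 + 8 = 0x48, which is the 48 at the start of the objdump line. objdump typically prints seven bytes per line and wraps the rest onto the next line, which would account for the three 00 bytes missing from the listing in the question.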
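As a rough check of the byte count, here is a minimal Python sketch that builds the full encoding from the pieces described above. The function name `encode_mov_r64_imm64` is made up for this example, and it takes the register number (7 for rdi) rather than a name.

```python
import struct

def encode_mov_r64_imm64(reg_num, imm):
    """Encode 'mov r64, imm64' (REX.W + B8+ rd io) for register number 0..15."""
    rex = 0x40 | 0x08 | (reg_num >> 3)   # base prefix | W bit | B bit for r8..r15
    opcode = 0xB8 + (reg_num & 0b111)    # low 3 bits of the register in the opcode
    imm_bytes = struct.pack("<Q", imm & 0xFFFFFFFFFFFFFFFF)  # 8-byte little-endian
    return bytes([rex, opcode]) + imm_bytes

code = encode_mov_r64_imm64(7, 0x1337)   # mov rdi, 0x1337
print(code.hex(" "))  # 48 bf 37 13 00 00 00 00 00 00
print(len(code))      # 10
```

The first seven bytes match the objdump line in the question, and the remaining three are the zero high bytes of the immediate.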
