NASM and 8-bit memory offset confusion


From the Intel Software Developer Manual (referred to as ISDM in this post) and the x86 Instruction Set Reference (which, I assume, is just a copy of the former), we know that the mov instruction can move data from eax/ax/al to a memory offset and vice versa.

For example, mov moffs8, al moves the contents of the al register to some 8-bit memory offset moffs8.

Now, what is moffs8? Quoting the ISDM (3.1.1.3):

moffs8, moffs16, moffs32, moffs64 — A simple memory variable (memory offset) of type byte, word, or doubleword used by some variants of the MOV instruction. The actual address is given by a simple offset relative to the segment base. No ModR/M byte is used in the instruction. The number shown with moffs indicates its size, which is determined by the address-size attribute of the instruction.

I emphasised the sentences saying that moffs8 is of type byte and is 8 bits in size.

I'm a beginner in assembly, so, immediately after having read this, I started playing around with the mov moffs8, al instruction using NASM. Here's the code I've written:

; File name: mov_8_bit_al.s
USE32

section .text
    mov BYTE [data], al

section .bss
    data resb 2

This is what nasm -f bin mov_8_bit_al.s produced (in hex):

A2 08 00 00 00

Here's how I understand this:

  • A2 is the opcode for MOV moffs8, AL
  • 08 is the memory offset itself, of size 1 byte
  • 00 00 00 is some garbage

It looks like 08 00 00 00 is the memory offset, but in this case, it's a moffs32, not moffs8! So, the CPU will read only one byte while executing A2, and treat 00 as an ADD instruction or something else, which was not intended.

At the moment, it seems to me that NASM is generating invalid machine code here, but I guess it's me who has misunderstood something... Maybe NASM doesn't follow the ISDM? If so, its code wouldn't execute properly on Intel CPUs, so it must be following it!

Can you please explain where I'm wrong?

Answer by harold (best answer)

The size suffix after moffs actually refers to the operand size, not the size of the address itself. This mirrors the meaning of the size suffix after r/m.

The manual actually says so in a note:

NOTES:
* The moffs8, moffs16, moffs32 and moffs64 operands specify a simple offset relative to the segment base, where 8, 16, 32 and 64 refer to the size of the data. The address-size attribute of the instruction determines the size of the offset, either 16, 32 or 64 bits.
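To make the note concrete, here is a minimal Python sketch that decodes the bytes from the question under that reading. The function name decode_mov_moffs8_al and its return format are made up for illustration, and it assumes a 16-bit or 32-bit code segment only (no 64-bit mode, no prefixes other than the 0x67 address-size override). It is not a real disassembler.

```python
def decode_mov_moffs8_al(code: bytes, default_addr_bits: int = 32) -> dict:
    """Decode a single `MOV moffs8, AL` (opcode A2) instruction.

    Hypothetical helper for illustration. Assumes 16- or 32-bit mode and
    handles only the 0x67 address-size override prefix.
    """
    if default_addr_bits not in (16, 32):
        raise ValueError("this sketch only models 16- and 32-bit modes")

    pos = 0
    addr_bits = default_addr_bits

    # 0x67 toggles the address-size attribute (32 -> 16 or 16 -> 32).
    if pos < len(code) and code[pos] == 0x67:
        addr_bits = 16 if default_addr_bits == 32 else 32
        pos += 1

    if pos >= len(code) or code[pos] != 0xA2:
        raise ValueError("not a MOV moffs8, AL instruction")
    pos += 1

    # The offset width follows the address size, not the "8" in moffs8.
    offset_len = addr_bits // 8
    if len(code) - pos < offset_len:
        raise ValueError("truncated offset")
    offset = int.from_bytes(code[pos:pos + offset_len], "little")

    return {
        "addr_bits": addr_bits,  # size of the offset field
        "data_bits": 8,          # size of the data moved (AL is one byte)
        "offset": offset,
        "length": pos + offset_len,
    }


# The bytes NASM produced in the question (USE32, so 32-bit addresses):
print(decode_mov_moffs8_al(bytes.fromhex("A2 08 00 00 00")))
# {'addr_bits': 32, 'data_bits': 8, 'offset': 8, 'length': 5}

# The same store with an address-size override: a 16-bit offset follows.
print(decode_mov_moffs8_al(bytes.fromhex("67 A2 08 00")))
# {'addr_bits': 16, 'data_bits': 8, 'offset': 8, 'length': 4}
```

In both cases one byte (AL) is stored. Only the width of the offset field changes, so the trailing 00 00 00 in the NASM output is the upper part of the 32-bit offset rather than garbage.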