Something has got me confused in x86 assembly for a while, it's how/when can NASM infer the size of the operation, here's an example:
mov ebx, [eax]
Here we are moving the 4 bytes stored at the address held in eax into ebx. The size of the operation is inferred as 4 bytes because the register is 32 bits.
However, this operation doesn't get inferred and throws a compile error:
mov [eax], 123456
Of course the solution is this:
mov dword [eax], 123456
Which will move the 32 representation of the number 123456 into the bytes stored at the address held at eax.
But this confuses me, surely it can see eax is 32 bit, so shouldn't it assume I want to store it as a 32 bit value without me having to specify dword after the mov?
Surely if I wanted to put the 16 bit representation of 12345 (smaller number to fit in 16 bits) into eax I would do this:
mov ax, 12345
The operand-size would be ambiguous (and so must be specified) for any instruction with a memory destination and an immediate source. (Neither operand actually being a register, even if using one or more in an addressing mode.)
Address-size and operand-size are separate attributes of an instruction.
Quoting what you said in a comment on another answer, since I think this gets at the core of your confusion:
The BYTE/WORD/DWORD [PTR] annotation is not about the size of the memory address; it's about the size of the variable in memory at that address. Assuming flat 32-bit addressing, addresses are always four bytes long, and therefore must go in Exx registers. So, when the source operand is an immediate value, the dword (or whatever) annotation on the destination operand is the only way the assembler can know whether it's supposed to modify 1, 2, or 4 bytes of RAM.
Perhaps it will help if I demonstrate the effect of these annotations on machine code:
(I've adjusted the spacing a bit compared to how
objdump
actually prints it.)Take note of two things: (1) the three different operand prefixes produce three different machine instructions, and (2) using a different prefix changes the length of the source operand as emitted into the machine code.