In Programming from the Ground Up, in chapter 3 I read
The general form of memory address references is this:
ADDRESS_OR_OFFSET(%BASE_OR_OFFSET, %INDEX, MULTIPLIER)
All fields are optional. To calculate the address, simply perform the following calculation:
FINAL ADDRESS = ADDRESS_OR_OFFSET + %BASE_OR_OFFSET + MULTIPLIER * %INDEX
ADDRESS_OR_OFFSET and MULTIPLIER must both be constants, while the other two must be registers. If one of the pieces is left out, it is just substituted with zero in the equation.
Now, I assume that "substituted with zero" is a typo, at least for MULTIPLIER: if its default were 0, the value of %INDEX would be irrelevant, because the product would always be zero anyway. I guess 0 is the default for the other three?
Nonetheless, what confuses me the most is that, from the description above, I understand that the parentheses and commas determine which parts of what we write map to the 4 "operands" of the addressing.
But then, in the following chapter I read
For example, the following code moves whatever is at the top of the stack into %eax:
movl (%esp), %eax
If we were to just do
movl %esp, %eax
%eax would just hold the pointer to the top of the stack rather than the value at the top.
But I don't understand why. I mean, given the FINAL ADDRESS expression above, I would say that
- if we put %esp in parentheses, it will play the role of %BASE_OR_OFFSET, with ADDRESS_OR_OFFSET and %INDEX defaulting to 0 and MULTIPLIER to 1,
- if we put %esp not in parentheses, it will play the role of ADDRESS_OR_OFFSET, with %BASE_OR_OFFSET and %INDEX defaulting to 0 and MULTIPLIER to 1,
and the sum would still be the same.
Furthermore, how is %esp constant?
- Maybe I'm making the mistake of thinking that it is not constant because I think about the content of %esp?
- If that's the case, and %esp is constant because it is the name of a physically fixed register, then what is a non-constant, in this context?
Correct, the default multiplier is 1.
movl %esp, %eax isn't using a memory addressing mode at all. It's a register-direct operand, so it's syntactically different from mov symbol_name, %eax (a load from an absolute address).
There's a register, but it's not inside (), so the disp(base,idx,scale) syntax doesn't apply.
In machine code, the ModRM byte's 2-bit "mode" field uses 0b11 to encode that it's a register operand instead of memory. (The other 3 encodings select memory with no displacement vs. disp8 vs. disp32: https://wiki.osdev.org/X86-64_Instruction_Encoding#ModR.2FM_and_SIB_bytes. And see also "rbp not allowed as SIB base?" for the fun special cases that allow disp32 with no registers, and that make the SIB byte optional to save machine-code size for simple addressing modes.) With ModR/M.mode = 11, the field is just a simple register number. Similarly in assembly language, when you use a bare register name, you just get the register operand directly, not using it as an address to access memory.
(I'm not sure this is a useful analogy, but I think the useful point is that a register operand is a different thing from a memory operand even in x86 machine code. They are qualitatively different and need to be distinguished.)
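To make the register-vs-memory distinction concrete, here is a toy model in Python. It is only a sketch, not a real emulator: `regs`, `mem`, `effective_address`, `mov_reg_to_reg` and `mov_mem_to_reg` are names I made up for illustration, and the register and memory contents are arbitrary example values.

```python
# Toy model (assumption: not real x86 semantics, just enough to show the idea).
regs = {"esp": 0x1000, "eax": 0, "ecx": 2}   # register name -> contents
mem = {0x1000: 42, 0x1008: 7}                # address -> 32-bit value stored there

def effective_address(disp=0, base=None, index=None, scale=1):
    """disp(base, index, scale): a missing disp/base/index counts as 0, a missing scale as 1."""
    b = regs[base] if base is not None else 0
    i = regs[index] if index is not None else 0
    return disp + b + scale * i

def mov_reg_to_reg(src, dst):
    """movl %src, %dst : register-direct operand, no address calculation and no memory access."""
    regs[dst] = regs[src]

def mov_mem_to_reg(dst, **addr):
    """movl disp(base,index,scale), %dst : compute the address, then load from memory."""
    regs[dst] = mem[effective_address(**addr)]

mov_reg_to_reg("esp", "eax")                              # movl %esp, %eax
print(hex(regs["eax"]))                                   # 0x1000 (the pointer itself)

mov_mem_to_reg("eax", base="esp")                         # movl (%esp), %eax
print(regs["eax"])                                        # 42 (the value at that address)

mov_mem_to_reg("eax", base="esp", index="ecx", scale=4)   # movl (%esp,%ecx,4), %eax
print(regs["eax"])                                        # 7 (loaded from 0x1000 + 4*2)
```

The two kinds of operand go through different functions on purpose: the address formula only ever runs for the memory form, so a bare %esp never gets plugged into it as a displacement.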
Also related to the default multiplier being 1, not 0: the shift count is 0 in the machine code, but source-level asm syntax uses power-of-2 multipliers. That's true in all x86 assembly syntaxes I've ever seen, including AT&T, all flavours of Intel, and Go's assembly dialect. (It would of course be possible to invent a new syntax that used shift counts in the asm source.)
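The relationship between the encoded shift count and the multiplier you write in the source can be sketched in a few lines of Python. The function names here are hypothetical helpers for illustration, not part of any assembler; the only fact relied on is that the scale field is 2 bits wide and encodes a left-shift count.

```python
def scale_field_to_multiplier(scale_bits):
    """Map the 2-bit machine-code scale field (a shift count) to the source-level multiplier."""
    if not 0 <= scale_bits <= 3:
        raise ValueError("the scale field is only 2 bits wide")
    return 1 << scale_bits

def multiplier_to_scale_field(multiplier=1):
    """Map a source-level multiplier to the shift count; an omitted multiplier means 1."""
    table = {1: 0, 2: 1, 4: 2, 8: 3}
    if multiplier not in table:
        raise ValueError("multiplier must be 1, 2, 4 or 8")
    return table[multiplier]

for bits in range(4):
    print(bits, "->", scale_field_to_multiplier(bits))
# 0 -> 1
# 1 -> 2
# 2 -> 4
# 3 -> 8
```

So "left out" means a shift count of 0 in the encoding, which is a multiplier of 1 in the source, and a multiplier of 0 simply isn't encodable.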