How are the two flags used to correctly calculate the answer when two numbers that are multiplied overflow the register?
E.g. if al
holds 0xff
and is multiplied by 0x2
, causing an overflow into ax
, how do the flags help with this?
Multiplication on x86/x64 never overflows when using the one operand form.
This is because mul and its sibling imul produce an output twice as wide as their operands1.
In your example, multiplying by al
produces an output in ax
and no overflow is generated.
The CF
and OF
are set when the result cannot fit in the operands size.
This can be used to perform multiplication with saturation, for example:
;Unsigned
mul ebx
sbb edx, edx ;EDX = CF replicated along all the 32 bits
or eax, edx ;EAX = 0ff..ffh if overflow, EAX*EBX otherwise
;Signed (perhaps not the most efficient way)
imul ebx
cmovc eax, 7fffffffh ;Signed positive max if overflow. (CMOV-immediate doesn't really exist, but imagine register sources)
cmovnc edx, 0 ; don't modify EAX for the non-overflow case.
sar edx, 31 ; EDX = all-ones if overflow && negative
xor eax, edx ; if negative && overflow
; flip 7fffffff (INT_MIN) to 80000000 (INT_MIN)
; else xor with 0 is a no-op
(Current Rust compilers also implement a.saturating_mul(b)
for i32 and i64 using FLAGS from imul, but with different setup: https://rust.godbolt.org/z/ab3jMjzbv)
However to implement a multi-precision multiplication, say 64x64-bit, those flags are not needed, in fact, denoting with 232 with k we have:
(a·k+b) × (c·k+d) = a·c·k2 + (a·d+b·c)·k + b·d
where the 32-bit products produce 64-bit results that are added as below
.----.----.
| b·d |
'----'----'
+
.----.----.
| a·d+b·c |
'----'----'
+
.----.----.
| a·c |
'----'----'
=
.----.----.----.----.
| 128-bit result |
'----'----'----'----'
1 And this suffices to prevent overflow.
The best way to answer questions like this is to read the manual.
Now I am going back to my paper copy of an 80188/186 (I don't know where my 8088/8086 programming manual is). But even back then, it is like folks are saying. The text says
And that is all fine: you can't overflow, as folks are saying, but folks don't normally write high-level language code that uses a result twice the size of the operands. It goes further to say:
So there you go: if you are doing an 8-bit operation and the result does not fit in 8 bits (0xFF * 2 = 0x1FE), then both CF and OF are set. At least, they were 20 years ago. It should be quite trivial to do an experiment to see what your current x86 does. You could have crafted this experiment before posting here, walk through every combination of operands and see what the flags do for each.