AArch64 SUB and similar instructions defined as an add-with-carry operation


I'm having trouble understanding why some instructions, such as sub, are defined in the manual as an AddWithCarry operation where the carry is hard-coded to 1:

bits(datasize) result;
bits(datasize) operand1 = if n == 31 then SP[]<datasize-1:0> else X[n, datasize];
bits(datasize) operand2;
operand2 = NOT(imm);
(result, -) = AddWithCarry(operand1, operand2, '1');
if d == 31 then
    SP[] = ZeroExtend(result, 64);
else
    X[d, datasize] = result;

The AddWithCarry operation is defined as such:

(bits(N), bits(4)) AddWithCarry(bits(N) x, bits(N) y, bit carry_in)
    integer unsigned_sum = UInt(x) + UInt(y) + UInt(carry_in);
    integer signed_sum = SInt(x) + SInt(y) + UInt(carry_in);
    bits(N) result = unsigned_sum<N-1:0>; // same value as signed_sum<N-1:0>
    bit n = result<N-1>;
    bit z = if IsZero(result) then '1' else '0';
    bit c = if UInt(result) == unsigned_sum then '0' else '1';
    bit v = if SInt(result) == signed_sum then '0' else '1';
    return (result, n:z:c:v);

Going by the definition of AddWithCarry, wouldn't passing 1 as the carry every time throw every subtraction off by 1 as well?

I know when we write stuff like:

sub sp, sp, #0x20

we actually subtract exactly 32 bytes from sp, so what's up with the carry bit in this operation?


There are 2 answers

Nate Eldredge

This also confused me at first. The key is the line before: it isn't operand2 = -imm as you might expect, but operand2 = NOT(imm), i.e. bitwise not (one's complement). In two's complement arithmetic, you can readily check that NOT(imm) = -imm - 1. So the carry being set to 1 effectively computes x + (-imm - 1) + 1 which is indeed x - imm.
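To check the identity concretely, here is a loose Python transliteration of the manual's pseudocode. It is only a sketch: the names `add_with_carry`, `to_signed` and `sub_imm` are made up for illustration, and a 64-bit datasize is assumed.

```python
N = 64                      # assumed register width (datasize)
MASK = (1 << N) - 1


def to_signed(v):
    """Interpret an N-bit pattern as a two's complement integer (like SInt)."""
    return v - (1 << N) if v >> (N - 1) else v


def add_with_carry(x, y, carry_in):
    """Loose transliteration of AddWithCarry; returns (result, (n, z, c, v))."""
    unsigned_sum = x + y + carry_in
    signed_sum = to_signed(x) + to_signed(y) + carry_in
    result = unsigned_sum & MASK
    n = result >> (N - 1)
    z = int(result == 0)
    c = int(result != unsigned_sum)
    v = int(to_signed(result) != signed_sum)
    return result, (n, z, c, v)


def sub_imm(x, imm):
    """SUB as in the pseudocode: operand2 = NOT(imm), carry-in fixed to 1."""
    result, _flags = add_with_carry(x, ~imm & MASK, 1)
    return result


# NOT(imm) is the bit pattern of -imm - 1 ...
assert (~0x20 & MASK) == (-0x20 - 1) & MASK
# ... so the hard-coded carry-in cancels the extra -1.
assert sub_imm(0x1000, 0x20) == 0x1000 - 0x20
```

The masking with `MASK` stands in for the fixed-width `bits(N)` type, since Python integers are unbounded like the pseudocode's `integer`.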

It's sort of an artifact of the way their pseudocode language is set up: their integer types are pure mathematical integers able to represent any number whatsoever, and arithmetic operators are defined only on such types. But here, the operands are of type bits(n), simply a bitstring, for which they define logical operators only. So writing operand2 = -imm wouldn't be well-formed. They'd have to say something like operand2 = (-SInt(imm))<n-1:0> which would be even more confusing.

It might also reflect a way that a simple ALU could implement addition and subtraction instructions. Rather than needing separate units for ADD, SUB, ADC, etc., you just need one that does add-with-carry. So ADC would connect the carry input of this unit to the actual C bit of the NZCV register; ADD would connect it to ground. SUB would pass the second input through an inverter, and connect the carry input to VDD. SBC does the same with the inverter and connects the carry input to the C flag. (Note this results in subtraction treating the C flag like a true carry, as opposed to x86 where SUB inverts the sense of the carry flag to make it behave like a borrow.)
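The wiring described above can be modelled in a few lines of Python. This is a toy model under stated assumptions, not a description of any real core: `adder` and `invert` are hypothetical names, the width is assumed to be 64 bits, and only the carry-out flag is tracked.

```python
N = 64                      # assumed datapath width
MASK = (1 << N) - 1


def adder(a, b, carry_in):
    """The single shared add-with-carry unit: returns (N-bit sum, carry-out)."""
    total = a + b + carry_in
    return total & MASK, total >> N


def invert(b):
    """Bitwise inverter placed on the second input."""
    return ~b & MASK


def add(a, b):
    return adder(a, b, 0)              # carry-in tied low


def adc(a, b, c_flag):
    return adder(a, b, c_flag)         # carry-in taken from the C flag


def sub(a, b):
    return adder(a, invert(b), 1)      # inverted input, carry-in tied high


def sbc(a, b, c_flag):
    return adder(a, invert(b), c_flag) # inverted input, carry-in from C
```

For example, `sub(10, 3)` yields a result of 7 with carry-out 1, while `sub(3, 10)` wraps around and yields carry-out 0.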

Peter Cordes

You've perhaps seen an explanation that most ALUs do a-b as a + (~b + 1), because they already have a binary adder, and bitwise NOT is very cheap in hardware. -b = ~b + 1 is a two's complement identity. "How does the CPU do subtraction?" is a good example, along with Wikipedia's binary Adder–subtractor article.

What some of those explanations omit is that the + 1 part is done with carry-in to the binary adder (like adc) so it's still only one addition, which matters for both performance and for getting useful flags results for signed oVerflow and unsigned Carry-out.

That's what's going on here. Note the operand2 = NOT(imm); rather than NEG or -imm. The extra +1 carry-in is required to get the right answer.

In ARM and AArch64, the Carry flag output from subtraction is the raw carry-out of the ALU's adder-subtractor, which does subtraction this way. This makes it a not-borrow flag: it is 0 after x - y if x < y as unsigned integers, and 1 otherwise.


Some other ISAs, notably x86, invert the carry flag output from the ALU so it's a borrow flag. That's why x86's subtraction counterpart of adc is sbb (subtract-with-borrow), which also has to invert CF on input to get a carry-in of 1 in the no-borrow case. ARM/AArch64 instead has sbc (subtract-with-carry), which feeds C in directly: 1 when there is no borrow, otherwise 0 to make the result lower by 1 when there is a borrow, normally from the less-significant chunk of a bigint.
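The two flag conventions can be compared with a small Python sketch. The function names are hypothetical, 64-bit operands are assumed, and the x86 side models only the architecturally visible CF value, not how any particular chip produces it.

```python
N = 64                      # assumed operand width
MASK = (1 << N) - 1


def arm_sub_carry(x, y):
    """Raw carry-out of x + NOT(y) + 1, as ARM/AArch64 expose it in C."""
    return (x + (~y & MASK) + 1) >> N


def x86_sub_carry(x, y):
    """x86-style CF after SUB: the same carry-out, inverted into a borrow."""
    return arm_sub_carry(x, y) ^ 1


for x, y in [(5, 3), (3, 3), (3, 5), (0, MASK)]:
    assert arm_sub_carry(x, y) == int(x >= y)   # not-borrow: set when no borrow
    assert x86_sub_carry(x, y) == int(x < y)    # borrow: set when x < y unsigned
```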

Either way, x86 sbb / ARM sbc are a way for software to chain 64-bit ALU operations into one wider adder/subtractor, exactly like normal carry propagation between full-adders in a ripple-carry adder. (Inside each 64-bit chunk, the ALU is probably doing fun stuff like carry-lookahead or carry-select to keep latency down to few enough gate delays to fit in one clock cycle, but it's generally not worth trying to do that in software.)
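As a rough illustration of that chaining, the Python sketch below subtracts multi-word integers one 64-bit limb at a time, using the ARM-style not-borrow convention. `sub_wide` is a hypothetical helper, and the little-endian limb layout is an assumption made for the example.

```python
N = 64                      # assumed limb width
MASK = (1 << N) - 1


def sub_wide(a_limbs, b_limbs):
    """Subtract two equal-length little-endian lists of 64-bit limbs.

    The first step behaves like SUBS (carry-in 1); every later step behaves
    like SBCS, taking the previous step's carry-out as its carry-in.
    Returns (result limbs, final carry-out).
    """
    carry = 1
    out = []
    for a, b in zip(a_limbs, b_limbs):
        total = a + (~b & MASK) + carry
        out.append(total & MASK)
        carry = total >> N
    return out, carry


# 2**64 - 1: the low limb borrows, so the high limb sees carry-in 0.
assert sub_wide([0, 1], [1, 0]) == ([MASK, 0], 1)
```

A final carry-out of 0 means the whole wide subtraction borrowed, i.e. the first operand was smaller as an unsigned number.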

