Range of immediate values in ARMv8 A64 assembly

My understanding is that immediate parameters in ARMv8 A64 assembly can be 12 bits long. If that is the case, why does this line of assembly code:

AND X12, X10, 0xFEF 

produce this error (when assembled with gcc):

Error:  immediate out of range at operand 3 -- `AND X12, X10, 0xFEF'

Interestingly enough, this line of assembly code compiles fine:

ADD X12, X10, 0xFEF

I'm using aarch64-linux-gnu-gcc (Linaro GCC 2014.11) 4.9.3 (prerelease)

There are 3 answers

Notlikethat On BEST ANSWER

Unlike A32's "flexible second operand", there is no common immediate format in A64. Each class of immediate-operand data-processing instruction has its own format (ignoring the boring and straightforward ones like shifts):

  • Arithmetic instructions (add{s}, sub{s}, cmp, cmn) take a 12-bit unsigned immediate with an optional 12-bit left shift.
  • Move instructions (movz, movn, movk) take a 16-bit immediate optionally shifted to any 16-bit-aligned position within the register.
  • Address calculations (adr, adrp) take a 21-bit signed immediate, although there's no actual syntax to specify it directly - to do so you'd have to resort to assembler expression trickery to generate an appropriate "label".
  • Logical instructions (and{s}, orr, eor, tst) take a "bitmask immediate", which I'm not sure I can even explain, so I'll just quote the mind-bogglingly complicated definition:

Such an immediate is a 32-bit or 64-bit pattern viewed as a vector of identical elements of size e = 2, 4, 8, 16, 32, or 64 bits. Each element contains the same sub-pattern: a single run of 1 to e-1 non-zero bits, rotated by 0 to e-1 bits. This mechanism can generate 5,334 unique 64-bit patterns (as 2,667 pairs of pattern and their bitwise inverse).
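
To tie the definition back to the question, here is a minimal C sketch that tests a 64-bit value against the quoted rule by brute force. The function names is_bitmask_imm64 and ror_elem are made up for illustration, and this is not how an assembler actually derives the encoding; it only checks "identical elements, each a rotated single run of ones".

```c
#include <stdint.h>

/* Rotate the low e bits of v right by r bits (0 <= r < e, e <= 64). */
uint64_t ror_elem(uint64_t v, unsigned r, unsigned e) {
  uint64_t mask = (e == 64) ? ~0ULL : ((1ULL << e) - 1);
  v &= mask;
  if (r == 0)
    return v;
  return ((v >> r) | (v << (e - r))) & mask;
}

/* Hypothetical helper: returns 1 if v fits the bitmask-immediate rule
   for a 64-bit operand, 0 otherwise. */
int is_bitmask_imm64(uint64_t v) {
  unsigned e, r, i;
  for (e = 2; e <= 64; e *= 2) {
    uint64_t mask = (e == 64) ? ~0ULL : ((1ULL << e) - 1);
    uint64_t elem = v & mask;
    int replicated = 1;
    /* Every e-bit element of v must be identical to the lowest one. */
    for (i = e; i < 64; i += e) {
      if (((v >> i) & mask) != elem) {
        replicated = 0;
        break;
      }
    }
    if (!replicated)
      continue;
    /* Some rotation must turn the element into a run of 1 to e-1 ones
       sitting in the low bits, i.e. a value of the form 2^k - 1. */
    for (r = 0; r < e; r++) {
      uint64_t x = ror_elem(elem, r, e);
      if (x != 0 && x != mask && (x & (x + 1)) == 0)
        return 1;
    }
  }
  return 0;
}
```

With this sketch, is_bitmask_imm64(0xFEF) returns 0: the value has two separate runs of ones (bits 0-3 and bits 5-11), so no rotation makes it a single run. 0xFF0 and 0xFFF both return 1, which is consistent with the assembler rejecting AND X12, X10, 0xFEF.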

EvgEnZh On

An alternative explanation of bitmask immediates, now that it is morning and I have finally understood the "mind-bogglingly complicated" definition. (See Notlikethat's answer.) Maybe it will be easier for some to understand.

A bitmask immediate is X > 0 consecutive zeros followed by Y > 0 consecutive ones, where X + Y is a power of 2 no larger than the register width, repeated to fill the whole operand and then rotated arbitrarily.

Also note that optional shifts in other immediate formats are by exact amounts of bits, not "up to". That is, the 16-bit immediates can be shifted by 0, 16, 32 or 48 bits exactly, while 12-bit immediates only by 0 or 12 bits.
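
A rough C sketch of the "exact amounts" point, in case it helps. The names is_addsub_imm and is_movz_imm64 are hypothetical, and the second one only models a plain movz on a 64-bit register (an assembler may pick other encodings for the mov alias).

```c
#include <stdint.h>

/* add/sub immediate: a 12-bit unsigned value shifted left by exactly
   0 or 12 bits. */
int is_addsub_imm(uint64_t v) {
  return (v & ~0xFFFULL) == 0 || (v & ~(0xFFFULL << 12)) == 0;
}

/* movz on an X register: a 16-bit value shifted left by exactly
   0, 16, 32 or 48 bits. */
int is_movz_imm64(uint64_t v) {
  unsigned shift;
  for (shift = 0; shift < 64; shift += 16)
    if ((v & ~(0xFFFFULL << shift)) == 0)
      return 1;
  return 0;
}
```

Under these assumptions, 0xFEF and 0xFFF000 pass is_addsub_imm, but 0x1FFE (0xFFF shifted by 1) does not, because a shift of 1 is not one of the allowed amounts. Likewise 0xFFFF0000 passes is_movz_imm64 while 0x1FFFE does not.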

Yan On

Here is a piece of code that dumps all legal bitmask immediates, following the mechanism quoted in Notlikethat's answer. I hope it helps in understanding how the rule for generating bitmask immediates works.

#include <stdio.h>
#include <stdint.h>

// Dumps all legal bitmask immediates for ARM64
// Total number of unique 64-bit patterns: 
//   1*2 + 3*4 + 7*8 + 15*16 + 31*32 + 63*64 = 5334

const char *uint64_to_binary(uint64_t x) {
  static char b[65];
  unsigned i;
  for (i = 0; i < 64; i++, x <<= 1)
    b[i] = (0x8000000000000000ULL & x)? '1' : '0';
  b[64] = '\0';
  return b;
}

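int main(void) {
  uint64_t result;
  unsigned size, length, rotation, e;
  // size: element width in bits (2, 4, 8, 16, 32, 64)
  for (size = 2; size <= 64; size *= 2)
    // length: number of consecutive ones in each element (1 to size-1)
    for (length = 1; length < size; ++length) {
      // start with `length` ones in the low bits of one element
      result = 0xffffffffffffffffULL >> (64 - length);
      // replicate the element until it fills all 64 bits
      for (e = size; e < 64; e *= 2)
        result |= result << e;
      // print each of the `size` rotations; rotating the whole 64-bit value
      // left by 1 rotates every replicated element left by 1
      for (rotation = 0; rotation < size; ++rotation) {
        printf("0x%016llx %s (size=%u, length=%u, rotation=%u)\n",
            (unsigned long long)result, uint64_to_binary(result),
            size, length, rotation);
        result = (result >> 63) | (result << 1);
      }
    }
  return 0;
}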