Bug in PTX ISA (carry propagation)?

311 views Asked by At

Is there a bug in Cuda? I have run the following code on my GTX580 and r1 is zero at the end. I expect that it is one due to carry propagation? I have tested the code with Cuda Toolkit 4.2.9 and 5.5 and use "nvcc -arch=sm_20 bug.cu -o bug && ./bug" to compile and run it.

#include <stdio.h>
#include <cuda.h>

__global__ void bug()
{
  unsigned int r1 = 0;
  unsigned int r2 = 0;

  asm( "\n\t"
       "sub.cc.u32 %0, 0, 1;\n\t"
       "addc.cc.u32 %1, 0, 0;\n\t"
     : "=r"(r1), "=r"(r2) );

  printf("r1 >> %04X\n", r1);
  printf("r2 >> %04X\n", r2);

}

int main(void)
{
  float *a_d;
  cudaMalloc((void **) &a_d, 1);

  bug <<< 1,1 >>> ();

  cudaFree(a_d);
}

Output
r1 >> FFFFFFFF
r2 >> 0000
2

There are 2 answers

1
user4811 On

I think I have found an explanation. There is a note in the PTX manual which says for the sub.cc instruction: "Behavior is the same for unsigned and signed integers."

0
Robert Crovella On

I believe you're making some assumptions about the CC.CF flag referenced in the PTX ISA documentation that may not be valid.

Note that the definition of specific states (e.g. 0 or 1) of this bit are never given that I can see. Furthermore, I don't find any mapping between the definition of "carry-in/carry-out" and "borrow-in/borrow-out"

Stated another way, I think you are assuming that a "borrow" status in this flag is identical to a "carry" status. In other words, you are assuming something like:

CF:  
0    =  (NO CARRY) or (NO BORROW)
1    =  (CARRY) or (BORROW)

But such a truth table or mapping is never given. Furthermore the manual states:

The condition code register ... is mainly intended for use in straight-line code sequences for computing extended-precision integer addition, subtraction, and multiplication.

I don't think your code satisfies the intent, nor do I think the above assumption of truth table for CC.CF is valid.

In fact what I think is happening is a truth table like this:

CF:  
0    =  (CARRY) or (NO BORROW)
1    =  (NO CARRY) or (BORROW)

(the 0 and 1 here are arbitrary; that is also not defined in the manual.)

All examples of code I have tried (about 6 cases, including yours) have fit the definition I have given above.

Having said this, I would think it unwise to depend on this, as it is mostly undocumented. A safe rule for computer architecture is that undocumented behavior may change in the future.