F2C translated code breaks when optimized by C++ compiler


I have a C++ program with a method that looks something like this:

int myMethod(int* arr1, int* arr2, int* index)
{
    arr1--;        
    arr2--;
    int val = arr1[*index];
    int val2 = arr2[val];
    return doMoreThings(val);
}

With optimizations enabled (/O2), the first line, where the first pointer is decremented, is not executed. I'm debugging the optimized and non-optimized builds side by side: the optimized build steps over the decrement, while the non-optimized build executes it. This produces an observable difference in behavior when the array is later accessed with arr1[*index].

UPDATE

As @stefaanv pointed out, the compiler may indeed omit the decrement if it uses a decremented access index instead, which it appears to do. So the omitted decrement is not what is causing the difference in behavior. Instead, something in the way the matrices are used causes it.

Looking further, I have narrowed it down to a method that contains nested loops performing a matrix multiplication (part of it is shown further down). Three arrays are involved: a, wa and t. At the beginning of the method, the f2c translation decrements the array pointers: an array that was 6 by 6 in Fortran is a flat double[36] in C, and to keep the old 1-based indexing, f2c moves each matrix pointer back by one plus the matrix's leading dimension (seven for a 6 by 6 matrix).
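To make the index arithmetic concrete, here is a minimal sketch of where the shifted f2c expression lands in the flat array. The helper name flat_offset is hypothetical (it is not part of f2c), and constexpr assumes a C++11 compiler, which is newer than the one used in the question.

```cpp
// Flat, 0-based offset of the Fortran element a(i, j) (1-based, column-major)
// for leading dimension a_dim1. This is the element that the f2c expression
// a[i + j * a_dim1] reaches once the pointer has been moved back by 1 + a_dim1.
// flat_offset is a hypothetical helper name used only for this illustration.
constexpr int flat_offset(int i, int j, int a_dim1)
{
    return (i + j * a_dim1) - (1 + a_dim1);
}

// For a 6x6 matrix the offsets cover exactly 0..35.
static_assert(flat_offset(1, 1, 6) == 0, "a(1,1) is the first element");
static_assert(flat_offset(1, 2, 6) == 6, "consecutive columns are a_dim1 apart");
static_assert(flat_offset(6, 6, 6) == 35, "a(6,6) is the last element");
```

Every element access ends up inside double[36]; only the intermediate pointer value, seven elements before the start, lies outside the array.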

Normally in this f2c-translated program, flat arrays are passed as &someArray[1] and methods begin by decrementing each array pointer by one. @Christoph pointed out that this should be valid, as the pointer is never moved outside the array's declared range.
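The pattern described above can be sketched as follows. The function names are hypothetical and the buffer is invented for illustration; the point is only that the decrement in the callee lands on element 0 of the caller's array, so the pointer stays inside the array.

```cpp
#include <cassert>

// Hypothetical f2c-style callee: shifts the pointer by one so that the
// Fortran indices 1..n can be used unchanged.
double sum_one_based(double* v, int n)
{
    assert(n >= 0);
    --v;  // in range only because the caller passed &buffer[1]
    double total = 0.0;
    for (int i = 1; i <= n; ++i) {
        total += v[i];
    }
    return total;
}

// Hypothetical caller: element 0 is padding that the callee never reads.
double sum_with_padding()
{
    double buffer[4] = {0.0, 1.0, 2.0, 3.0};
    return sum_one_based(&buffer[1], 3);
}
```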

In this method, however, the arrays are NOT passed as a pointer to an element further into the array (&someArray[1]). They are local, statically sized arrays, e.g. mat[36], that are passed directly to the multiplication method.

void test()
{
    double mat[36];
    ...
    mul(mat, .., ..)
}

void mul(double* a, double* t, double* wa, int M, int N, int K)
{
    // F2C array decrements.
    a -= (1+M); // E.g. decrement by seven for a[6x6]!
    t -= (1+N); 
    wa--;
    ...
    for (j = K; j <= M; ++j) {         
       for (i = 1; i <= N; ++i) {
          ii = K;
          wa[i] = 0.;          
          for (p = 1; p <= N; ++p) {
             wa[i] += t[p + i * t_dim1] * a[ii + j * a_dim1];
             ++ii;
          }
       }

       ii = K;      
       for (i = 1; i <= N; ++i) {
          a[ii + j * a_dim1] = wa[i];
          if (j > kn) {
             a[j + ii * a_dim1] = wa[i];
          }
          ++ii;
       }
    }
}
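For the direct call in test() above, one possible way to leave the f2c-style callee untouched is to give the caller's storage 1 + M padding elements in front of the matrix, so that the shifted pointer still points into the same array. This is only a sketch under that assumption: trace_f2c_style and padded_call_demo are hypothetical names, and the callee is a simplified stand-in for mul.

```cpp
#include <cassert>

// Hypothetical f2c-style callee: moves the pointer back by 1 + dim1 and then
// uses 1-based, column-major indices (here it just sums the diagonal).
double trace_f2c_style(double* a, int dim1)
{
    assert(dim1 >= 1);
    a -= 1 + dim1;  // stays in range only if the caller left enough padding
    double total = 0.0;
    for (int k = 1; k <= dim1; ++k) {
        total += a[k + k * dim1];
    }
    return total;
}

// Hypothetical caller: reserves 1 + 6 unused elements in front of the 6x6 data.
double padded_call_demo()
{
    const int dim1 = 6;
    const int pad = 1 + dim1;
    double storage[pad + 36] = {};   // padding followed by the matrix
    double* mat = storage + pad;     // first real matrix element
    for (int k = 0; k < 36; ++k) {
        mat[k] = 1.0;
    }
    return trace_f2c_style(mat, dim1);
}
```

The cost is seven wasted doubles per matrix, and every caller has to follow the convention, so this trades one kind of manual change for another.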

So the question is:

Does this mean that the behavior is undefined and may break under optimization when you do what f2c has done here, i.e. subtract 7 from the double[36] array pointer but then access every item of the array at its correct location (by using indices that are offset by 7)?

Edit: found this in the C FAQ, does this apply here?

Pointer arithmetic is defined only as long as the pointer points within the same allocated block of memory, or to the imaginary "terminating" element one past it; otherwise, the behavior is undefined, even if the pointer is not dereferenced. .... References: K&R2 Sec. 5.3 p. 100, Sec. 5.4 pp. 102-3, Sec. A7.7 pp. 205-6; ISO Sec. 6.3.6; Rationale Sec. 3.2.2.3.

UPDATE 2:

If I recompile with the multidimensional arrays using decremented indices rather than decremented pointers,

#define a_ref(a_1,a_2) a[(a_2)*a_dim1 + (a_1) - 1 - a_dim1]

a_ref(1,2);

Then the method produces the same (expected) output regardless of optimizations. The one-dimensional arrays, which are only decremented by one, do not appear to cause any issues.

I could change all the multidimensional arrays in the program to use the above access method, but there are too many one-dimensional arrays to change manually, so ideally I would like a solution that works for both.
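The same index-based access can also be written without a macro. The following is a minimal sketch, not what f2c generates: FortranMatrixView and fill_demo are hypothetical names, and the bounds assert is an addition for illustration.

```cpp
#include <cassert>

// Hypothetical 1-based, column-major view over a flat array. The pointer is
// never moved; the offset 1 + dim1 is subtracted from the index instead.
struct FortranMatrixView {
    double* data;  // first element of the flat array
    int dim1;      // leading dimension (number of rows)
    int dim2;      // number of columns

    double& operator()(int i, int j) const
    {
        assert(i >= 1 && i <= dim1 && j >= 1 && j <= dim2);
        return data[j * dim1 + i - 1 - dim1];
    }
};

// Example use: store 10*i + j in element (i, j) of a 6x6 matrix.
void fill_demo(double* flat)
{
    FortranMatrixView a = {flat, 6, 6};
    for (int j = 1; j <= 6; ++j) {
        for (int i = 1; i <= 6; ++i) {
            a(i, j) = 10.0 * i + j;
        }
    }
}
```

Because the view keeps the original pointer, no intermediate pointer value ever lies outside the array.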

New questions:

  • Is there an option for f2c to use this array access method rather than pointer fiddling? It seems like a simple change in f2c that would produce well-defined code, so you would think it was already an option.
  • Are there any other solutions to this problem (other than just skipping optimizations and hoping that the program is well behaved, despite relying on undefined behavior)?
  • Is there something I can do in the C++ compiler? I compile with Microsoft C++ (2010), as a managed C++ project.

There are 2 answers

5
slartibartfast On

Moving a pointer-to-array out of the array's specified range and then accessing through it (although the complete expression falls back into the range) is undefined behaviour AFAIK. Nevertheless, on nearly every implementation this code should work as intended, so the question is: what are you looking at? Maybe there is an implicit predecrement in the assembler instruction that loads the int from arr1[]? Just because you don't see the decrement in the debugger doesn't mean it is not there. Check whether the right element is accessed by writing a distinguishing value to it.

4
stefaanv On

The optimizer only has to make sure that there is no observable behavior change, so it can choose not to do the decrement and to access the data with a decremented index instead (the extra offset can be part of the opcode), because the function uses a copy of the pointer to the array. You don't tell us how the array is actually accessed or whether the optimizer is actually introducing an error, so I can only guess.

But as slartibartfast already said, it is undefined behavior, and the decrement should be replaced by int val = arr1[*index-1]; after checking that *index > 0.
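Applied to the method from the question, that suggestion might look like the sketch below. The length parameters n1 and n2 are hypothetical additions so that the range checks can be written down, and doMoreThings is a stand-in because the original is not shown.

```cpp
#include <cassert>

// Stand-in for the question's doMoreThings, which is not shown; it is
// assumed here only so that the sketch compiles.
int doMoreThings(int val)
{
    return val;
}

// Index-decrement variant of myMethod: the pointers are left untouched and
// the 1-based indices are converted at each access.
// n1 and n2 (the array lengths) are hypothetical extra parameters.
int myMethod(const int* arr1, int n1, const int* arr2, int n2, const int* index)
{
    assert(*index >= 1 && *index <= n1);
    int val = arr1[*index - 1];
    assert(val >= 1 && val <= n2);
    int val2 = arr2[val - 1];
    (void)val2;  // not used further, as in the question's snippet
    return doMoreThings(val);
}
```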