In our lecture we recently took a look at the C99 standard on pointer equality (6.5.9.6) and applied it to nested arrays. There it states that pointers are only guaranteed to be equal if "one is a pointer to one past the end of one array object and the other is a pointer to the start of a different array object that happens to immediately follow the first array object in the address space".
The professor then explained that this is the reason the array access a[0][19] is technically undefined for a nested array with dimensions 4*5. Is this true? If so, why are negative indices such as a[1][-1] defined?
Neither a[0][19] nor a[1][-1] has behavior defined by the C standard.

C 2018 6.5.2.1 2 tells us that array subscripting is defined in terms of pointer arithmetic: E1[E2] is identical to (*((E1)+(E2))).
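Thus a[0][19] is identical to *(a[0] + 19) (where some parentheses have been omitted because they are unnecessary), and a[1][-1] is identical to *(a[1] + -1).

In a[0] + 19 and a[1] + -1, a[0] and a[1] are arrays. In these expressions, they are automatically converted to pointers to their first elements, per C 2018 6.3.2.1 3. So these expressions are equivalent to p + 19 and q + -1, where p and q are the addresses of those first elements, &a[0][0] and &a[1][0], respectively.

C 2018 6.5.6 8 defines pointer arithmetic: when an integer is added to a pointer that points to an element of an array object, the result points to the element offset by that many positions, provided that element exists or the result is one past the last element of the array; otherwise, the behavior is undefined.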
So p + 19 would point to element 19 of a[0] if it existed. But a[0] is an array of 5 elements, so element 19 does not exist, and therefore the behavior of p + 19 is not defined by the standard.

Similarly, q + -1 would point to element -1 of a[1], but element -1 does not exist, and therefore the behavior of q + -1 is not defined by the standard.

The fact that these arrays are contained within a larger array, and that we know the memory layout of all elements in this larger array, does not matter. The C standard does not define the behavior in terms of the larger memory layout; it specifies behavior based on the specific array in which pointer arithmetic is being evaluated. A C implementation would be free to make this arithmetic work like simple address arithmetic and to define the behavior if it desired, but it is also permitted not to do this. Compiler optimization has become more sophisticated and aggressive over the years, and it may transform these expressions based on the C standard’s rules about specific array arithmetic without regard to the memory layout, and this can cause the expressions to fail (that is, not behave as they would with simple address arithmetic).