CPUs with addressable GPR files, address of register variables, and aliasing between memory and registers


Background

Some CPUs, such as the Atmel AVR, have a general purpose register file that is also addressable as part of main memory -- see Figure 7-2 in section 7.4 and the paragraph after the figure.

What was WG14 thinking?

Given this, why did the C committee choose to make

register int ri;
int* pi = &ri;

universally ill-formed, as per footnote 101 to N1124 section 6.7.1? Wouldn't undefined or implementation-defined behavior make more sense, considering that the code above is meaningful on at least one processor, and C bends over backwards to accommodate far stranger (and scarcer!) targets than the AVR?

101) The implementation may treat any register declaration simply as an auto declaration. However, whether or not addressable storage is actually used, the address of any part of an object declared with storage-class specifier register cannot be computed, either explicitly (by use of the unary & operator as discussed in 6.5.3.2) or implicitly (by converting an array name to a pointer as discussed in 6.3.2.1). Thus, the only operator that can be applied to an array declared with storage-class specifier register is sizeof.

I just changed a CPU register through a pointer. Wat?!

Furthermore, using the GCC explicit register variables extension, it is possible to direct the compiler to place a variable into a specific register. In this case, you can get a pointer that aliases with a register variable, as below:

register int ri asm("r15") = 0;
int* pi = (int*)0x0F; /* data-space address of r15 on a classic AVR */
/* pi now aliases ri */
*pi = 42;
/* ri is 42 now */
assert(ri == 42);

How does GCC deal with such a case? It strikes me as truly bizarre that something like this has not been considered...or has it?

Answer by Alex Celeste

C is an abstract language defined without knowledge of the machine that will eventually implement it. The definition of C does not assume that the underlying machine will even have registers in the conventional form (or a stack, or contiguous memory, or many other things irrelevant to this question that are present on real machines).

The point is that register does not mean the variable should be assigned a machine register. What the keyword guarantees is that the variable cannot have its address taken; the compiler can then, in theory, optimise it better, because there are fewer paths through which the variable can be modified. Taking the address of a register variable isn't meaningful in C, regardless of what processor it runs on: register is an incredibly badly-named keyword (named for the most obvious optimisation it enables) whose only enforceable meaning is that the address cannot be taken.
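As a rough illustration of what that restriction looks like in practice, here is a minimal sketch. The helper names bump_register_scalar and register_array_bytes are made up for this example, and the rejected lines are left as comments because a conforming compiler should refuse them:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: a register scalar is used by value like any other. */
int bump_register_scalar(int v)
{
    register int ri = v;
    /* int *pi = &ri;      rejected: unary & applied to a register object */
    ri += 1;
    return ri;
}

/* Hypothetical helper: sizeof is the one operator a register array allows. */
size_t register_array_bytes(void)
{
    register int ra[4];
    /* ra[0] = 1;          not allowed: indexing needs a pointer to ra[0] */
    return sizeof ra;      /* sizeof does not convert ra to a pointer */
}
```

Nothing here depends on whether ri or ra ever lives in a machine register; the only observable effect of the keyword is which expressions are accepted.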

An intelligent compiler for the AVR should be able to make that optimisation without needing you to hint at it. In practice the keyword is useless precisely because any half-decent compiler can detect when it would be applicable, since there's basically no well-defined way to reference an auto object without taking its address explicitly.
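To make that concrete, here is a small sketch comparing a register accumulator with a plain auto one whose address is never taken. The names calls, opaque, sum_with_register and sum_with_auto are invented for the example, and what a given compiler actually emits for each is not something this code can show; it only demonstrates that both forms give the compiler the same information and the same result:

```c
#include <assert.h>

int calls;                       /* hypothetical side-effect counter */
void opaque(void) { ++calls; }   /* stands in for code that cannot reach acc */

/* Accumulator declared register: its address cannot be taken at all. */
int sum_with_register(int n)
{
    register int acc = 0;
    assert(n >= 0);
    for (int i = 0; i < n; ++i) {
        acc += i;
        opaque();                /* cannot modify acc: no pointer to it exists */
    }
    return acc;
}

/* Plain auto accumulator: its address is never taken, which the compiler
   can see for itself, so opaque() cannot modify it either. */
int sum_with_auto(int n)
{
    int acc = 0;
    assert(n >= 0);
    for (int i = 0; i < n; ++i) {
        acc += i;
        opaque();
    }
    return acc;
}
```

In both functions the compiler can prove that acc is untouched across the call to opaque(), so it is free to keep acc wherever it likes.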