Is this method of pointer tagging in C standard-compliant?

(Recap on pointer tagging: where the size of an object means that a limited number of bits in its pointer will always go unused and can be re-purposed for other uses, such as marking the object's type.)

An excellent answer to my previous question on this subject confirmed that the naive method of converting pointers to integers and doing things to those integers cannot technically be relied upon to work (disregarding its popularity in practice).

Upon thinking about it some more, I think I have a solution that works for the specific case described in the original question (all objects are the same size, all objects are allocated from a single "heap" array). Can someone confirm my reasoning though?

// given:
typedef ... Cell; // such that sizeof(Cell) == 8
Cell heap[1024];  // or dynamically allocated, whatever

// then:
void * tagged = ((char *)&heap[42]) + 3;  // pointer with tag 3 (whatever that means)

int tag = ((char *)tagged - (char *)heap) % sizeof(Cell);  // 3
Cell * ptr = (Cell *)((char *)tagged - tag);               // &heap[42]

In words: no assumptions are made about the integer representation of a pointer. Tags are applied by indexing the bytes within the pointed-to object. (This much is certainly allowed.)

Pointer subtraction returns the difference in the indices of two elements of the same array. The bytes within an object are contiguous, so a tagged pointer is converted to a tag by taking the index of the addressed byte within the heap and discarding the bytes of all the preceding cells, i.e. taking the remainder modulo sizeof(Cell). The tagged pointer can then be restored to a Cell pointer by subtracting that now-known byte offset.

Is this all compliant and therefore a portable method of pointer tagging? Is pointer subtraction still allowed to work if the type of the array is converted to something else, char in this case? Can I use the sizeof(Cell) in this way?

(No, I don't know why this technicality preys on my mind so much; yes, portability is easily achieved by other means.)

There is 1 answer

Jens Gustedt (best answer)

The only place where I think you'd have to be more careful here is your integer types. Don't use int:

  • The correct type for a difference of pointers is ptrdiff_t.
  • The result of sizeof is size_t, an unsigned type.
  • You are doing % between a ptrdiff_t and a size_t, so the result is most probably size_t, i.e. an unsigned value.
  • Converting that size_t to int is implementation-defined when the value doesn't fit (so not portable); usually it will just drop the high-order bits.

int will work for a long time, but the day you use that with a very large array on a 64-bit processor you will regret it (after days of debugging).
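
For illustration, here is a minimal sketch of the question's scheme with the integer types changed as suggested above. The struct definition of Cell and the helper names tag_cell, get_tag and untag are my own assumptions, not anything from the question; the only thing taken from it is the byte-indexing idea and the single heap array.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the question's Cell; any 8-byte type would do. */
typedef struct { unsigned char bytes[8]; } Cell;
_Static_assert(sizeof(Cell) == 8, "this sketch assumes 8-byte cells");

static Cell heap[1024];

/* Tag a cell by pointing at byte number `tag` inside it (tag < sizeof(Cell)). */
static void *tag_cell(Cell *cell, size_t tag)
{
    assert(tag < sizeof(Cell));
    return (char *)cell + tag;
}

/* Recover the tag: byte offset from the start of the heap, modulo the cell size. */
static size_t get_tag(const void *tagged)
{
    /* ptrdiff_t is the type of a pointer difference; it is non-negative here
       because `tagged` always points into heap, at or after its first byte. */
    ptrdiff_t offset = (const char *)tagged - (const char *)heap;
    return (size_t)offset % sizeof(Cell);
}

/* Recover the original Cell pointer by stepping back over the tag bytes. */
static Cell *untag(void *tagged)
{
    return (Cell *)((char *)tagged - get_tag(tagged));
}
```

With these helpers, tag_cell(&heap[42], 3) should give a pointer for which get_tag returns 3 and untag returns &heap[42], with no int anywhere in the arithmetic.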