On which compilers, or with which compiler flags, can violating the pointer arithmetic rules cause trouble?

The C++ standard requires pointer arithmetic to be performed within an array ([expr.add]), which makes implementing vector-like containers difficult.

One could implement a vector-like container along these lines:

//First approach

//Allocation
auto buffer = new unsigned char[2*sizeof(int)];
//Construction
auto p=new(buffer) int{};
new(p+1) int{};
//Example of use of an iterator, assign 10 to the second element.
*(p+1)=10;//UB: p+1 is a pointer past the end of an object.

The previous piece of code illustrates approximately how std::vector is implemented in libstdc++ and libc++. It seems that compilers accept this kind of code as an extension to the C++ language.

If I want to be standard-compliant, I could implement a vector and its associated iterator in such a way that the operations performed on them reduce to this code:

//Second approach

//Allocation:
auto buffer = new unsigned char[2*sizeof(int)];
//Construction
new(buffer) int{};
new(buffer+sizeof(int)) int{};
//Example of use of an iterator assign 10 to the second element
*(std::launder(reinterpret_cast<int*>(buffer+sizeof(int))))=10;

(First question: is this approach not also UB? Here the pointer arithmetic is performed on the array of unsigned char that provides storage for the int objects. std::launder is used because the buffer and the int objects are not pointer-interconvertible.)
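For anyone who wants to experiment with it, here is a minimal self-contained sketch of the second approach wrapped in a function. The name second_element_via_bytes is hypothetical and only for illustration, std::launder requires C++17, and the sketch does not settle whether the approach is UB:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <new>

// Hypothetical helper (not part of any library): builds two ints in a raw
// byte buffer, writes `value` to the second one the way the second
// approach does, and reads it back.
int second_element_via_bytes(int value) {
    // Allocation: storage for two ints.
    unsigned char* buffer = new unsigned char[2 * sizeof(int)];
    // Sanity check: the storage must be suitably aligned for int.
    assert(reinterpret_cast<std::uintptr_t>(buffer) % alignof(int) == 0);

    // Construction: one int at offset 0, one at offset sizeof(int).
    new (buffer) int{};
    new (buffer + sizeof(int)) int{};

    // "Iterator" access: the arithmetic is done on the unsigned char
    // array, then the result is laundered into a pointer to the int.
    int* it = std::launder(reinterpret_cast<int*>(buffer + sizeof(int)));
    *it = value;
    int result = *it;

    // int is trivially destructible, so only the storage is released.
    delete[] buffer;
    return result;
}
```

Calling second_element_via_bytes(10) mirrors the snippet above: it assigns 10 to the second element and returns the value read back through the laundered pointer.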

The problem with this second approach is the code generated by the compiler (GCC):

#include <new>

int test_approach_1(unsigned char* buffer){
    //Construction
    auto p = new(buffer) int{};
    new(p+1) int{10};
    //Example of use of an iterator: assign 13 to the second element
    *(p+1)=13;//UB
    return *(p+1);//UB
}

int test_approach_2(unsigned char* buffer){
    //Construction
    new(buffer) int{};
    new(buffer+sizeof(int)) int{10};
    //Example of use of an iterator: assign 13 to the second element
    *(std::launder(reinterpret_cast<int*>(buffer+sizeof(int))))=13;
    return *(std::launder(reinterpret_cast<int*>(buffer+sizeof(int))));
}

Generated assembly:

test_approach_1(unsigned char*):
        movabs  rax, 55834574848
        mov     QWORD PTR [rdi], rax
        mov     eax, 13
        ret
test_approach_2(unsigned char*):
        movabs  rax, 42949672960
        mov     QWORD PTR [rdi], rax
        mov     eax, 13
        mov     DWORD PTR [rdi+4], 13
        ret

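The code generated for test_approach_1 is optimal: the construction with 10 and the later assignment of 13 are folded into a single store. For test_approach_2, the compiler first stores 10 and then emits a separate store of 13. So I think I will not use the second approach (and I would have one more reason not to use it if someone shows that it is also UB).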
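To make the two movabs immediates in the assembly easier to read: each one is a pair of ints packed into a single 64-bit store, with the first element in the low 32 bits and the second in the high 32 bits (x86-64 is little-endian). A small sketch that checks this at compile time; pack is a hypothetical helper introduced only for this illustration:

```cpp
#include <cstdint>

// Hypothetical helper: packs two 32-bit values the way a single
// little-endian 64-bit store lays them out in memory
// (`low` lands at offset 0, `high` at offset 4).
constexpr std::uint64_t pack(std::uint32_t low, std::uint32_t high) {
    return (static_cast<std::uint64_t>(high) << 32) | low;
}

// test_approach_1: first element 0, second element already 13.
static_assert(pack(0, 13) == 55834574848ULL, "immediate in test_approach_1");
// test_approach_2: first element 0, second element still 10.
static_assert(pack(0, 10) == 42949672960ULL, "immediate in test_approach_2");
```

So in test_approach_1 the value 13 is already part of the one 8-byte store, while in test_approach_2 the 8-byte store still carries 10 and the extra mov DWORD PTR [rdi+4], 13 overwrites it.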

I cannot find any documentation for the language extensions that would allow vector-like containers to be implemented with the first approach (which is UB according to the standard). Is there any documentation for them? On which compilers can I expect the first approach to work, and with which compiler flags?
