AUTOSAR rule A5-0-4 stance on pointer arithmetic


AUTOSAR rule A5-0-4 states

Pointer arithmetic shall not be used with pointers to non-final classes.

It provides the following rationale:

Pointer arithmetic is only well defined if the pointed-to type of the pointer equals the element type of the array it points into, otherwise the behavior is undefined. This property can only be guaranteed if the pointer operand is a pointer to non-class type or a pointer to final class type.

It then gives some examples of compliance and non-compliance. One of the non-compliances has me puzzled:

void Foo(Base *start, size_t len)
{
  // Non-Compliant: pointer arithmetic on non-final pointer type
  for (Base *iter = start; iter != start + len; ++iter)
  {
    iter->Do();
  }
}

However unlikely it is that anyone would use a plain C-array to store polymorphic pointers, it's in the very fabric of the C++ language.

So my gut feeling is that there is nothing wrong with the above code.

Perhaps I am mistaken in this belief. Or perhaps I am completely missing the point that this AUTOSAR rule is trying to convey to the reader.

Can someone explain it better than the AUTOSAR document does?

See also AUTOSAR rule A5-0-4, which gives the full code used in the example.


There are 2 answers

Answer by Jerry Coffin (best answer)

I suspect you're missing the point. Consider something like this:

class Base {
   int x;
public:
   virtual void Do() = 0;
   virtual ~Base() = default;
};

class Derived : public Base {
    int y;
public:
    virtual void Do() override {
         ++y;
    }
};

int main() {
    Derived data[10];

    Foo(data, 10); // using definition of `Foo` from question
}

This has undefined behavior. Foo expects to receive the address of the first element of an array of Base, but we're passing the first element of an array of Derived (implicitly converted to Base *). Since Foo "thinks" it's manipulating pointers to Base, when its loop does ++iter, it's going to increment the address by sizeof(Base).

As we've defined it, however, a Derived is almost certainly larger than a Base. After the increment iter almost certainly won't point to the second Derived object, and when we try to call Do on that pointer, things will go sideways (no guarantee, but in a typical implementation, it'll end up trying to use data[0].y as a vtable pointer to find the address of Do).

With the code precisely as it stands now, a compiler with decent optimization can probably cover up the problem. Since Base has a pure virtual function, it can't be instantiated. There's only one class in the program derived from Base, so the compiler can conclude that any time it's dealing with Base objects via a pointer (or reference), they must really be instances of Derived, and act accordingly.

To be sure of seeing the problem, we sort of need to have two derived classes, preferably of different sizes, and each with its own definition of Do, so the compiler can't statically determine that there's only one possible type involved.

#include <cstddef>   // size_t
#include <iostream>

class Base {
protected:
   int x {1};
public:
   virtual void Do() = 0;
   virtual ~Base() = default;
};

class Derived1 : public Base {
    int y;
public:
    virtual void Do() override {
        ++y;
        std::cout << "x: " << x << ", y: " << y << "\n";
    }
};

class Derived2 : public Base {
    int y;
    int z;
public:
    virtual void Do() override {
        ++z;
        std::cout << "x: " << x << ", y: " << y << ", z: " << z << "\n";
    }
};

void Foo(Base *start, size_t len)
{
  // Non-Compliant: pointer arithmetic on non-final pointer type
  for (Base *iter = start; iter != start + len; ++iter)
  {
    iter->Do();
  }
}

int main() {
    Derived1 data1[10];
    Derived2 data2[10];

    std::cout << "data 1:\n";
    Foo(data1, 10);
    std::cout << "\ndata 2:\n";
    Foo(data2, 10);
}

When I attempt to compile/run this, I get:

data 1:
x: 1, y: 1
x: 1, y: 32737
x: 1, y: 1
x: 1, y: 1
x: 1, y: 32737
x: 1, y: 0
x: 1, y: 32737
x: 1, y: 32737
x: 1, y: 32737
x: 1, y: 1

data 2:
x: 1, y: 32767, z: -576197143
Segmentation fault (core dumped)

So in this case, "goes sideways" translates to a segmentation fault (but you can't entirely count on that as the only possible result).

Execution on Godbolt.org gives a similar result.

Answer by SoronelHaetir

The example in the question would be fine if it were declared as

void Foo(Base **start, size_t len);

That is, with start being of type Base ** (a pointer into an array of Base * pointers). As written, each increment advances by sizeof(Base), the size of a whole object, rather than by sizeof(Base *).