Const overload unexpectedly called in gcc. Compiler bug or compatibility fix?


We have a much larger application that relies on overloading constructor templates on char and const char arrays. With GCC 7.5, Clang, and Visual Studio, the code below prints "NON-CONST" for all cases. With GCC 8.1 and later, however, the output differs, as shown after the code:

#include <cstddef>   // size_t
#include <iostream>

class MyClass
{
public:
    template <size_t N>
    MyClass(const char (&value)[N])
    {
        std::cout << "CONST " << value << '\n';
    }

    template <size_t N>
    MyClass(char (&value)[N])
    {
        std::cout << "NON-CONST " << value << '\n';
    }
};

MyClass test_1()
{
    char buf[30] = "test_1";
    return buf;
}

MyClass test_2()
{
    char buf[30] = "test_2";
    return {buf};
}

void test_3()
{
    char buf[30] = "test_3";
    MyClass x{buf};
}

void test_4()
{
    char buf[30] = "test_4";
    MyClass x(buf);
}

void test_5()
{
    char buf[30] = "test_5";
    MyClass x = buf;
}

int main()
{
    test_1();
    test_2();
    test_3();
    test_4();
    test_5();
}

The gcc 8 and 9 output (from godbolt) is:

CONST test_1
NON-CONST test_2
NON-CONST test_3
NON-CONST test_4
NON-CONST test_5

This appears to me to be a compiler bug, but I guess it could be some other issue related to a language change. Does anybody know definitively?

There are 2 answers

StoryTeller - Unslander Monica On BEST ANSWER

When you return a plain id-expression from a function (one that designates a function-local object), the compiler is required to perform overload resolution in up to two stages. First it treats the expression as though it were an rvalue rather than an lvalue. Only if that first overload resolution fails is it performed again with the object as an lvalue.

[class.copy.elision]

3 In the following copy-initialization contexts, a move operation might be used instead of a copy operation:

  • If the expression in a return statement is a (possibly parenthesized) id-expression that names an object with automatic storage duration declared in the body or parameter-declaration-clause of the innermost enclosing function or lambda-expression, or

  • ...

overload resolution to select the constructor for the copy is first performed as if the object were designated by an rvalue. If the first overload resolution fails or was not performed, or if the type of the first parameter of the selected constructor is not an rvalue reference to the object's type (possibly cv-qualified), overload resolution is performed again, considering the object as an lvalue. [ Note: This two-stage overload resolution must be performed regardless of whether copy elision will occur. It determines the constructor to be called if elision is not performed, and the selected constructor must be accessible even if the call is elided.  — end note ]
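As a rough illustration of the two resolutions described in the quote, the sketch below uses free function templates instead of constructors, and an explicit std::move to stand in for "as if the object were designated by an rvalue". The names which, as_lvalue, and as_rvalue are made up for this example. It models only the reference-binding rules, not the return statement itself.

```cpp
#include <cstddef>
#include <string>
#include <utility>

// Same parameter types as the two constructors in the question.
template <std::size_t N>
std::string which(const char (&)[N]) { return "CONST"; }

template <std::size_t N>
std::string which(char (&)[N]) { return "NON-CONST"; }

// Second stage: the buffer is an lvalue, so both overloads are viable
// and the non-const reference is the better match.
std::string as_lvalue()
{
    char buf[30] = "buf";
    return which(buf);
}

// First stage (modelled with std::move): a non-const lvalue reference
// cannot bind to an rvalue, so only the const overload is viable.
std::string as_rvalue()
{
    char buf[30] = "buf";
    return which(std::move(buf));
}
```

Here as_lvalue() yields "NON-CONST" and as_rvalue() yields "CONST", which matches the two results seen for test_1 across compilers.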

If we were to add an rvalue overload,

template <size_t N>
MyClass (char (&&value)[N])
{
    std::cout << "RVALUE " << value << '\n';
}

the output will become

RVALUE test_1
NON-CONST test_2
NON-CONST test_3
NON-CONST test_4
NON-CONST test_5

and this would be correct. What is not correct is GCC's behavior as you see it. It considers the first overload resolution a success, because a const lvalue reference may bind to an rvalue. However, it ignores the text "or if the type of the first parameter of the selected constructor is not an rvalue reference to the object's type". According to that text, since the selected constructor takes a const lvalue reference, GCC must discard the result of the first overload resolution and perform it again with buf as an lvalue.
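To see the ranking outside of a return statement, here is a small sketch with free function templates (pick, pick_lvalue, and pick_rvalue are hypothetical names, not part of the question's code). With an rvalue-reference overload present, an rvalue argument prefers it over the const lvalue reference, while an lvalue argument still goes to the non-const overload.

```cpp
#include <cstddef>
#include <string>
#include <utility>

template <std::size_t N>
std::string pick(const char (&)[N]) { return "CONST"; }

template <std::size_t N>
std::string pick(char (&)[N]) { return "NON-CONST"; }

// Not a forwarding reference: this binds only to rvalue arrays.
template <std::size_t N>
std::string pick(char (&&)[N]) { return "RVALUE"; }

std::string pick_lvalue()
{
    char buf[30] = "buf";
    return pick(buf);             // lvalue argument
}

std::string pick_rvalue()
{
    char buf[30] = "buf";
    return pick(std::move(buf));  // rvalue (xvalue) argument
}
```

pick_lvalue() yields "NON-CONST" and pick_rvalue() yields "RVALUE". Both the const lvalue reference and the rvalue reference can bind to the rvalue, but binding an rvalue reference is ranked as the better conversion.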

Well, that's the situation up to C++17 anyway. The current standard draft says something different.

If the first overload resolution fails or was not performed, overload resolution is performed again, considering the expression or operand as an lvalue.

The text from up to C++17 was removed. So it's a time-traveling bug: GCC implements the C++20 behavior, but it does so even when the selected standard is C++17.
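If the application needs the NON-CONST constructor regardless of the GCC version, one possible approach, based on how test_2 and test_4 behave in the question, is to avoid returning a bare id-expression. A sketch, where Tagged is a made-up stand-in for MyClass that records the selected constructor instead of printing it:

```cpp
#include <cstddef>
#include <string>

// Hypothetical stand-in for MyClass: stores which constructor ran.
struct Tagged
{
    std::string chosen;

    template <std::size_t N>
    Tagged(const char (&)[N]) : chosen("CONST") {}

    template <std::size_t N>
    Tagged(char (&)[N]) : chosen("NON-CONST") {}
};

// Braced list: copy-list-initialization, buf is used as an lvalue.
Tagged via_braces()
{
    char buf[30] = "buf";
    return {buf};
}

// Explicit construction: the returned expression is no longer an
// id-expression, so the "treat as rvalue first" rule does not apply.
Tagged via_explicit()
{
    char buf[30] = "buf";
    return Tagged(buf);
}
```

Both functions select the NON-CONST constructor, because in each case overload resolution sees buf as an ordinary lvalue.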

Nathan Chappell On

There is a debate about whether or not this is "intuitive behavior" in the comments, so I thought I would take a stab at the reasoning behind this behavior.

There's a pretty nice talk from CppCon that makes this a bit clearer to me. Basically, what does a function that takes a non-const reference imply? That the input object must be readable and writable. Even stronger, it implies "I intend to modify this object; this function has side effects." A const reference implies read-only access, and an rvalue reference means "I may take the resources." If test_1() were to end up calling the NON-CONST constructor, it would mean "I intend to modify this object" even though the object no longer exists once the function returns, which (I think) would be a bug. (I'm thinking of a case where how a reference is bound during initialization depends on whether the argument passed in is const.)

What's a bit more concerning to me is the subtlety introduced by test_2(). Here, copy-list-initialization takes place instead of the [class.copy.elision] rules quoted above. Now you're really saying "return an object of type MyClass as if I had initialized it with buf", so the NON-CONST behavior is invoked. I've always thought of init-lists as a way of being more concise, but here the braces make a significant semantic difference. This would matter more if the constructors for MyClass took a larger number of arguments. Suppose you wished to create a buf, modify it, and then return it along with those other arguments, invoking the CONST behavior. E.g., say you have the constructors:

template <size_t N>
MyClass(const char (&value)[N], int)
{
    std::cout << "CONST int " << value << '\n';
}

template <size_t N>
MyClass(char (&value)[N], int)
{
    std::cout << "NON-CONST int " << value << '\n';
}

And test:

MyClass test_0() {
    char buf[30] = "test_0";
    return {buf,0};
}

Godbolt tells us we get the NON-CONST behavior, even though CONST is probably what we want (after you've drunk the Kool-Aid on function-argument semantics). So here the copy-list-initialization does not do what we'd like. The following test makes my point better:

MyClass test_0() {
    char buf[30] = "test_0";
    buf[0] = 'T';
    const char (&bufR)[30]{buf};
    return {bufR,0};
}
// OUTPUT: CONST int Test_0
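Assuming C++17 is available, std::as_const from <utility> can express the same rebinding without a named reference. A sketch, with TaggedInt as a made-up stand-in for the two-argument MyClass constructors that records the selected overload instead of printing it:

```cpp
#include <cstddef>
#include <string>
#include <utility>  // std::as_const (C++17)

// Hypothetical stand-in for the two-argument MyClass constructors.
struct TaggedInt
{
    std::string chosen;

    template <std::size_t N>
    TaggedInt(const char (&)[N], int) : chosen("CONST int") {}

    template <std::size_t N>
    TaggedInt(char (&)[N], int) : chosen("NON-CONST int") {}
};

// Plain braced list: buf is a non-const lvalue.
TaggedInt plain_list()
{
    char buf[30] = "test_0";
    return {buf, 0};
}

// std::as_const yields a const char (&)[30], so only the const
// overload can accept the first argument.
TaggedInt const_view()
{
    char buf[30] = "test_0";
    buf[0] = 'T';
    return {std::as_const(buf), 0};
}
```

plain_list() selects the NON-CONST int constructor and const_view() selects the CONST int one, the same result as the bufR version above.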

Now, to get the proper semantics with copy-list-initialization, the buffer needs to be "rebound" as const at the end. I guess if the goal were for this object to initialize some other MyClass object, using the NON-CONST behavior in the returned braced list would be okay as long as the move or copy constructor invoked the appropriate behavior, but that is starting to sound pretty delicate.