When does lvalue-to-rvalue conversion happen, how does it work, and can it fail?


I see the term "lvalue-to-rvalue conversion" used in many places throughout the C++ standard. This kind of conversion is often done implicitly, as far as I can tell.

One unexpected (to me) feature of the phrasing from the standard is that they decide to treat lvalue-to-rvalue as a conversion. What if they had said that a glvalue is always acceptable instead of a prvalue? Would that phrase actually have a different meaning? For example, we read that lvalues and xvalues are examples of glvalues. We don't read that lvalues and xvalues are convertible to glvalues. Is there a difference in meaning?

Before my first encounter with this terminology, I used to model lvalues and rvalues mentally more or less as follows:

lvalues are always able to act as rvalues, but in addition can appear on the left side of an =, and to the right of an &.

This matches my intuition that wherever I could write a literal, I can write a variable name instead. This model seems consistent with the implicit lvalue-to-rvalue conversion terminology used in the standard, as long as this implicit conversion is guaranteed to happen.

But, because they use this terminology, I started wondering whether the implicit lvalue-to-rvalue conversion may fail to happen in some cases. That is, maybe my mental model is wrong here. Here is the relevant wording in [basic.lval] p6 (thanks to the commenters):

Whenever a glvalue appears as an operand of an operator that requires a prvalue for that operand, the lvalue-to-rvalue, array-to-pointer, or function-to-pointer standard conversions are applied to convert the expression to a prvalue.

[Note: An attempt to bind an rvalue reference to an lvalue is not such a context; see [dcl.init.ref]. — end note]

I understand what they describe in the note is the following:

int x = 1;
int && y = x; // ill-formed: in this declaration context, y won't bind to x.
// The literal 1 would have bound, so this is one context where the implicit
// lvalue-to-rvalue conversion did not happen.
// The expression on the right is an lvalue. If it had been a prvalue, it would have bound.
// Therefore, the lvalue-to-prvalue conversion did not happen (which is good).

So, my question is (are):

  1. Could someone clarify the contexts where this conversion can happen implicitly? Specifically, other than the context of binding to an rvalue reference, are there any other where lvalue-to-rvalue conversions fail to happen implicitly?

  2. Is rvalue-reference binding not a context where we expect a prvalue expression (on the right)?

  3. Like other conversions, does the glvalue-to-prvalue conversion involve work at runtime that would allow me to observe it?

My aim here is not to ask if it is desirable to allow such a conversion. I'm trying to learn to explain to myself the behavior of this code using the standard as starting point.


A good answer would go through the quote I placed above and explain (based on parsing the text) whether the note in it is also implicit from its text. It would then maybe add any other quotes that let me know the other contexts in which this conversion may fail to happen implicitly, or explain there are no more such contexts. Perhaps a general discussion of why glvalue to prvalue is considered a conversion.


There are 3 answers

13
dyp On BEST ANSWER

I think the lvalue-to-rvalue conversion is more than just using an lvalue where an rvalue is required. It can create a copy of a class object, and it always yields a value, not an object.

I'm using n3485 for "C++11" and n1256 for "C99".


Objects and values

The most concise description is in C99/3.14:

object

region of data storage in the execution environment, the contents of which can represent values

There's also a bit in C++11/[intro.object]/1

Some objects are polymorphic; the implementation generates information associated with each such object that makes it possible to determine that object’s type during program execution. For other objects, the interpretation of the values found therein is determined by the type of the expressions used to access them.

So an object contains (or at least can contain) a value.


Value categories

Despite their name, value categories classify expressions, not values. lvalue-expressions cannot even be considered values.

The full taxonomy / categorization can be found in [basic.lval]; here's a StackOverflow discussion.

Here are the parts about objects:

  • An lvalue ([...]) designates a function or an object. [...]
  • An xvalue (an “eXpiring” value) also refers to an object [...]
  • A glvalue (“generalized” lvalue) is an lvalue or an xvalue.
  • An rvalue ([...]) is an xvalue, a temporary object or subobject thereof, or a value that is not associated with an object.
  • A prvalue (“pure” rvalue) is an rvalue that is not an xvalue. [...]

Note the phrase "a value that is not associated with an object". Also note that as xvalue-expressions refer to objects, true values must always occur as prvalue-expressions.


The lvalue-to-rvalue conversion

As footnote 53 indicates, it should now be called "glvalue-to-prvalue conversion". First, here's the quote:

1    A glvalue of a non-function, non-array type T can be converted to a prvalue. If T is an incomplete type, a program that necessitates this conversion is ill-formed. If the object to which the glvalue refers is not an object of type T and is not an object of a type derived from T, or if the object is uninitialized, a program that necessitates this conversion has undefined behavior. If T is a non-class type, the type of the prvalue is the cv-unqualified version of T. Otherwise, the type of the prvalue is T.

This first paragraph specifies the requirements and the resulting type of the conversion. It isn't yet concerned with the effects of the conversion (other than Undefined Behaviour).

2    When an lvalue-to-rvalue conversion occurs in an unevaluated operand or a subexpression thereof the value contained in the referenced object is not accessed. Otherwise, if the glvalue has a class type, the conversion copy-initializes a temporary of type T from the glvalue and the result of the conversion is a prvalue for the temporary. Otherwise, if the glvalue has (possibly cv-qualified) type std::nullptr_t, the prvalue result is a null pointer constant. Otherwise, the value contained in the object indicated by the glvalue is the prvalue result.

I'd argue that you'll see the lvalue-to-rvalue conversion most often applied to non-class types. For example,

struct my_class { int m; };

my_class x{42};
my_class y{0};

x = y;

The expression x = y does not apply the lvalue-to-rvalue conversion to y (that would create a temporary my_class, by the way). The reason is that x = y is interpreted as x.operator=(y), which by default takes y by reference, not by value (for reference binding, see below; it cannot bind an rvalue, as that would be a temporary object different from y). However, the default definition of my_class::operator= does apply the lvalue-to-rvalue conversion to the member y.m when it assigns it to x.m.
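To make the point about x = y observable, here is a minimal sketch. The type `tracked` and the function `copies_made_by_assignment` are hypothetical names that are not part of the original example; the copy constructor only counts how often it runs. Under the assumption that the assignment operator takes its argument by const reference (as the implicitly declared one does), no temporary should be created by the assignment.

```cpp
#include <cassert>

// Hypothetical type (not from the answer above) that counts copy constructions.
struct tracked {
    int m;
    static int copies;  // number of copy-constructor calls so far

    explicit tracked(int v) : m(v) {}
    tracked(const tracked& other) : m(other.m) { ++copies; }

    // Same shape as the implicitly declared operator=: takes a reference.
    tracked& operator=(const tracked& other) {
        m = other.m;  // lvalue-to-rvalue conversion on other.m: reads an int
        return *this;
    }
};
int tracked::copies = 0;

// Returns how many copy constructions the assignment x = y triggered.
int copies_made_by_assignment() {
    tracked x(42);
    tracked y(0);
    int before = tracked::copies;
    x = y;  // x.operator=(y): y is bound to a reference, no temporary is made
    assert(x.m == 0);
    return tracked::copies - before;
}
```

Calling copies_made_by_assignment() should yield 0, since y is never converted to a prvalue of class type; only the int member is read.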

Therefore, the most important part to me seems to be

Otherwise, the value contained in the object indicated by the glvalue is the prvalue result.

So typically, an lvalue-to-rvalue conversion will just read the value from an object. It isn't just a no-op conversion between value (expression) categories; it can even create a temporary by calling a copy constructor. And the lvalue-to-rvalue conversion always returns a prvalue value, not a (temporary) object.
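A small sketch of the non-class case, using hypothetical names (`c`, `read_then_modify`, `size_in_unevaluated_operand`) that are not from the standard. It tries to illustrate three things from the two quoted paragraphs: the prvalue has the cv-unqualified type, the conversion reads the value out of the object, and inside an unevaluated operand nothing is accessed.

```cpp
#include <cstddef>
#include <type_traits>

const int c = 7;

// The declared type of c keeps the const, and (c) is an lvalue of type const int.
static_assert(std::is_same<decltype(c), const int>::value, "declared type");
static_assert(std::is_same<decltype((c)), const int&>::value, "lvalue expression");

// Unary + works on a prvalue operand, so c is converted first;
// the resulting prvalue has the cv-unqualified type int.
static_assert(std::is_same<decltype(+c), int>::value, "prvalue type is int");

// The conversion reads the value stored in the object at that point.
int read_then_modify() {
    int x = 1;
    int snapshot = x;  // value 1 is read from x
    x = 2;             // later writes to x do not affect the value already read
    return snapshot;
}

// In an unevaluated operand the object is not accessed, so naming an
// uninitialized variable inside sizeof is fine.
std::size_t size_in_unevaluated_operand() {
    int u;                 // deliberately uninitialized
    return sizeof(u + 1);  // u + 1 is never evaluated; only its type (int) matters
}
```

read_then_modify() returns 1, and size_in_unevaluated_operand() returns sizeof(int).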

Note that the lvalue-to-rvalue conversion is not the only conversion that converts an lvalue to a prvalue: There's also the array-to-pointer conversion and the function-to-pointer conversion.
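For comparison, a minimal sketch of the other two conversions. The names `arr`, `twice`, `first_via_decay` and `call_via_pointer` are made up for illustration. Arrays and functions are excluded from the lvalue-to-rvalue conversion itself, so initializing a pointer from them goes through array-to-pointer or function-to-pointer conversion instead.

```cpp
#include <type_traits>

int arr[3] = {10, 20, 30};
int twice(int v) { return 2 * v; }

// sizeof does not convert its operand, so the array type is still visible here.
static_assert(sizeof(arr) == 3 * sizeof(int), "no decay inside sizeof");
static_assert(std::is_same<decltype(arr), int[3]>::value, "arr is an array");

int first_via_decay() {
    int* p = arr;  // array-to-pointer conversion: prvalue pointing at arr[0]
    return *p;
}

int call_via_pointer() {
    int (*f)(int) = twice;  // function-to-pointer conversion
    return f(21);
}
```

first_via_decay() returns 10 and call_via_pointer() returns 42.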


Values and expressions

Most expressions don't yield objects[[citation needed]]. However, an id-expression can be an identifier, which denotes an entity. An object is an entity, so there are expressions which yield objects:

int x;
x = 5;

The left hand side of the assignment-expression x = 5 also needs to be an expression. x here is an id-expression, because x is an identifier. The result of this id-expression is the object denoted by x.

Expressions apply implicit conversions: [expr]/9

Whenever a glvalue expression appears as an operand of an operator that expects a prvalue for that operand, the lvalue-to-rvalue, array-to-pointer, or function-to-pointer standard conversions are applied to convert the expression to a prvalue.

And /10 about usual arithmetic conversions as well as /3 about user-defined conversions.

I'd love now to quote an operator that "expects a prvalue for that operand", but cannot find any but casts. For example, [expr.dynamic.cast]/2 "If T is a pointer type, v [the operand] shall be a prvalue of a pointer to complete class type".

The usual arithmetic conversions required by many arithmetic operators do invoke an lvalue-to-rvalue conversion indirectly via the standard conversion used. All standard conversions but the three that convert from lvalues to rvalues expect prvalues.

The simple assignment however doesn't invoke the usual arithmetic conversions. It is defined in [expr.ass]/2 as:

In simple assignment (=), the value of the expression replaces that of the object referred to by the left operand.

So although it doesn't explicitly require a prvalue expression on the right hand side, it does require a value. It is not clear to me if this strictly requires the lvalue-to-rvalue conversion. There's an argument that accessing the value of an uninitialized variable should always invoke undefined behaviour (also see CWG 616), no matter if it's by assigning its value to an object or by adding its value to another value. But this undefined behaviour is only required for an lvalue-to-rvalue conversion (AFAIK), which then should be the only way to access the value stored in an object.

If this more conceptual view is valid, that we need the lvalue-to-rvalue conversion to access the value inside an object, then it'd be much easier to understand where it is (and needs to be) applied.


Initialization

As with simple assignment, there's a discussion whether or not the lvalue-to-rvalue conversion is required to initialize another object:

int x = 42; // initializer is a non-string literal -> prvalue
int y = x;  // initializer is an object / lvalue

For fundamental types, [dcl.init]/17 last bullet point says:

Otherwise, the initial value of the object being initialized is the (possibly converted) value of the initializer expression. Standard conversions will be used, if necessary, to convert the initializer expression to the cv-unqualified version of the destination type; no user-defined conversions are considered. If the conversion cannot be done, the initialization is ill-formed.

However, it also mentions the value of the initializer expression. As with the simple assignment expression, we can take this as an indirect invocation of the lvalue-to-rvalue conversion.


Reference binding

If we see the lvalue-to-rvalue conversion as a way to access the value of an object (plus the creation of a temporary for class type operands), we understand why it's not generally applied when binding to a reference: A reference is an lvalue; it always refers to an object. So if we bound values to references, we'd need to create temporary objects holding those values. And this is indeed the case if the initializer-expression of a reference is a prvalue (which is a value or a temporary object):

int const& lr = 42; // create a temporary object, bind `lr` to it
int&& rv = 42;      // same

Binding a prvalue to a non-const lvalue reference is prohibited, but prvalues of class types with conversion functions that yield lvalue references may be bound to lvalue references of the converted type.

The complete description of reference binding in [dcl.init.ref] is rather long and rather off-topic. I think the essence of it relating to this question is that references refer to objects, therefore no glvalue-to-prvalue (object-to-value) conversion.
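A rough sketch of that "references refer to objects" view, with hypothetical function names. The static_asserts use std::is_constructible as a stand-in for "would this reference declaration compile", which is an approximation I'm assuming is close enough here; the functions then check that a bound reference aliases the original object, whereas initializing an int reads the value.

```cpp
#include <type_traits>
#include <utility>

// int&& y = x; from the question: an rvalue reference does not bind to an lvalue.
static_assert(!std::is_constructible<int&&, int&>::value, "int&& from lvalue");
// It does bind to an rvalue, and a const lvalue reference binds to an lvalue.
static_assert(std::is_constructible<int&&, int>::value, "int&& from rvalue");
static_assert(std::is_constructible<const int&, int&>::value, "const int& from lvalue");

bool reference_aliases_object() {
    int x = 1;
    const int& r = x;  // binds directly to x, the value is not read here
    int copy = x;      // lvalue-to-rvalue conversion: reads the value 1
    x = 2;
    return &r == &x && r == 2 && copy == 1;
}

bool xvalue_binds_to_same_object() {
    int x = 1;
    int&& rr = std::move(x);  // xvalue still refers to x, no temporary is made
    rr = 5;
    return x == 5;
}
```

Both functions return true: the references track the object, while copy keeps the value that was read at initialization.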

3
callisto On

On glvalues: A glvalue ("generalized" lvalue) is an expression that is either an lvalue or an xvalue. A glvalue may be implicitly converted to a prvalue with the lvalue-to-rvalue, array-to-pointer, or function-to-pointer implicit conversion.

Lvalue transformations are applied when an lvalue argument (e.g. a reference to an object) is used in a context where an rvalue (e.g. a number) is expected.

Lvalue to rvalue conversion
A glvalue of any non-function, non-array type T can be implicitly converted to a prvalue of the same type. If T is a non-class type, this conversion also removes cv-qualifiers. Unless encountered in an unevaluated context (in an operand of sizeof, typeid, noexcept, or decltype), for a class type T this conversion effectively copy-constructs a temporary object of type T using the original glvalue as the constructor argument, and that temporary object is returned as a prvalue; for a non-class type, the result is the value contained in the object. If the glvalue has the type std::nullptr_t, the resulting prvalue is the null pointer constant nullptr.

0
Ben Voigt On

Before you dive into the details, you should know that lvalue-to-rvalue conversion means a read from memory.

As such, it can fail at runtime if the lvalue is invalid (a reference formed from an invalid pointer, a dangling reference whose target object has gone out of scope, etc.) or if the object being read is uninitialized.

Now, compiler optimizations might reorder the actual memory read, combine multiple reads by caching the value in a CPU register, etc. But the meaning is always a fetch of the object's existing value.

You get undefined behavior if a hypothetical memory read would not succeed, even if the optimizer has transformed the code to avoid actually doing a memory read. The possible reordering / speculative reads are one of the sources of "time-travel undefined behavior", where a code path that will reach a fetch from an invalid lvalue starts acting weird earlier.