What's the meaning of "reserved for any use"?


NOTE: This is a C question, though I added the C++ tag in case some C++ expert can provide a rationale or historical reason why C++ uses different wording than C.


In the C standard library specification, we have this normative text, C17 7.1.3 Reserved identifiers (emphasis mine):

  • All identifiers that begin with an underscore and either an uppercase letter or another underscore are always reserved for any use.
  • All identifiers that begin with an underscore are always reserved for use as identifiers with file scope in both the ordinary and tag name spaces.

Now I keep reading answers on SO by various esteemed C experts, where they claim it is fine for a compiler or standard library to use identifiers with underscore + uppercase, or double underscore.

Doesn't "reserved for any use" mean reserved for anyone except future extensions to the C language itself? Meaning that the implementation is not allowed to use them.

Whereas the second bullet above, regarding a single leading underscore, seems to be directed at the implementation?

In general, the C standard is written in a way that expects compiler vendors/library implementers to be the typical reader - not so much the application programmers.

Notably, C++ has a very different wording:

  • Each name that contains a double underscore (__) or begins with an underscore followed by an uppercase letter (2.11) is reserved to the implementation for any use.

(See What are the rules about using an underscore in a C++ identifier?)

Is this perhaps a mix-up between C and C++ and the languages are different here?

There are 6 answers

5
zwol On BEST ANSWER

In the C standard, the meaning of the term "reserved" is defined by 7.1.3p2, immediately below the bullet list you are quoting:

No other identifiers are reserved. If the program declares or defines an identifier in a context in which it is reserved (other than as allowed by 7.1.4), or defines a reserved identifier as a macro name, the behavior is undefined.

Emphasis mine: reserved identifiers place a restriction on the program, not the implementation. Thus, the common interpretation – reserved identifiers may be used by the implementation for any purpose – is correct for C.

I have not kept up with the C++ standard and no longer feel qualified to interpret it.

2
supercat On

The C Standard allows implementations to attach any meaning they see fit to reserved identifiers. Most implementations will treat unrecognized identifiers of reserved forms the same as any other unrecognized identifiers when there is no reason to do otherwise, thus allowing something like:

#ifdef __ACME_COMPILER
#define near __near
#else
#define near
#endif

int near foo;

to declare an identifier foo using a __near qualifier if the code is being processed in an Acme compiler (which would presumably support such a thing), but also be compatible with other compilers that would not require or benefit from the use of such a directive. Nothing would forbid a conforming implementation from defining __ACME_COMPILER and interpreting __near to mean "launch nuclear missiles", but a quality implementation shouldn't go out of its way to break code like the above. If an implementation doesn't know what __ACME_COMPILER is supposed to mean, treating it like any other unknown identifier would allow it to support useful constructs like the above.

0
David Hammen On

C has multiple contexts in which a symbol can have a definition:

  • The space of macro names,
  • The space of formal names of arguments to a macro (this space is specific to each function-like macro),
  • The space of ordinary identifiers,
  • The space of tag names,
  • The space of labels (this space is specific to each function), and
  • The space of structure/union members (this space is specific to each struct/union).

What "reserved for any use" means is that user code in a compliant program cannot use [1] symbols that start with an underscore followed by an uppercase letter or another underscore in any of the above contexts. Compare this with identifiers that start with a single underscore but are followed by a lowercase letter or a digit. These fall into the second class of identifiers that start with an underscore. User code can use these identifiers as the names of macro arguments, as labels, or as the names of structure/union members.
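As a rough illustration of that second class, here is a minimal sketch of user code that puts underscore-plus-lowercase names only in places other than file scope. All names (`SQUARE`, `point`, `point_sum`, `sum_squares_until_negative`) are made up for the example, and the block-scope variables reflect my reading of 7.1.3 rather than something stated above.

```c
#include <stddef.h>

/* _v is a macro parameter, not a file-scope identifier. */
#define SQUARE(_v) ((_v) * (_v))

struct point {
    int _x;   /* member names live in the struct's own name space */
    int _y;
};

/* Adds the two members of a point. */
int point_sum(struct point p)
{
    return p._x + p._y;
}

/* Sums the squares of v[0..n-1], stopping at the first negative value. */
int sum_squares_until_negative(const int *v, size_t n)
{
    int _total = 0;                  /* block scope, so not file scope */
    for (size_t _i = 0; _i < n; ++_i) {
        if (v[_i] < 0)
            goto _done;              /* labels have their own name space */
        _total += SQUARE(v[_i]);
    }
_done:
    return _total;
}
```

Renaming any of these to something like `_Total` or `__total` would move it into the "reserved for any use" class, where none of these contexts is available to user code.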

"Reserved for any use" does not mean that the implementation cannot use such symbols. The intent of the reservation is to provide a name space that implementations can freely use without concern that the names defined by the implementation will conflict with the names defined by the user code in a compliant program.


[1] The standard does not quite mean "cannot use". The standard encourages the programmatic use of a small number of names that start with a double underscore. For example, a compliant implementation is required to define __STDC_VERSION__, __FILE__, __LINE__, and __func__. The 2011 version of the standard even gives an example of a presumably compliant program that references __func__.
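To make the distinction in this footnote concrete, here is a small sketch in which user code references these names without ever declaring or defining them. The helper names (`current_function_name`, `WHERE`, `language_version`) are invented for the example.

```c
#include <stdio.h>

/* Returns the enclosing function's name via the predefined
   identifier __func__ (available since C99). */
const char *current_function_name(void)
{
    return __func__;
}

/* Writes "file:line" for the place where the macro is expanded. */
#define WHERE(buf, size) \
    snprintf((buf), (size), "%s:%d", __FILE__, __LINE__)

/* Returns __STDC_VERSION__, or 0 if the implementation does not
   define it (for example in a C90 mode). */
long language_version(void)
{
#ifdef __STDC_VERSION__
    return __STDC_VERSION__;
#else
    return 0L;
#endif
}
```

The program only reads these names. Writing `#define __LINE__ 0` or declaring a variable called `__func__` would be a different matter, since that is the "declares or defines" case that 7.1.3 makes undefined.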

6
Lundin On

Regarding the difference in wording in C versus C++, I'm posting my own little research here as reference:

...names which are intended for use only by functions of the library begin with an underscore so they are less likely to collide with names in a user's program.

  • K&R 2nd edition added an Appendix B which addresses the standard library, where we can read

External identifiers that begin with an underscore are reserved for use by the library, as are all other identifiers that begin with an underscore and an upper-case letter or another underscore.

  • Early ANSI C drafts, as well as "C90" ISO 9899:1990, have the same text as the current ISO standard.

  • The earliest C++ drafts, however, have different text, as noted by @hvd, possibly as a clarification of the C standard. From DRAFT: 20 September 1994:

17.3.3.1.2 Global names
...
Each name that begins with an underscore and either an uppercase letter or another underscore (2.8) is reserved to the implementation for any use

So apparently the wording "reserved for any use" was invented by the ANSI/ISO C90 committee, whereas the C++ committee some years later used a clearer wording, similar to the wording in the pre-standard K&R book.


The C99 rationale V5.10 says this below 7.1.3:

Also reserved for the implementor are all external identifiers beginning with an underscore, and all other identifiers beginning with an underscore followed by a capital letter or an underscore. This gives a name space for writing the numerous behind-the-scenes non-external macros and functions a library needs to do its job properly.

This makes the committee's intention quite clear: "reserved for any use" means "reserved for the implementor".


Also of note, the current C standard has the following normative text elsewhere, in 6.2.5:

There may also be implementation-defined extended signed integer types. 38)

where the informative footnote 38 says:

38) Implementation-defined keywords shall have the form of an identifier reserved for any use as described in 7.1.3.
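For a feel of what such an implementation-defined keyword looks like in practice, here is a sketch using `__int128`, which GCC and Clang provide on many 64-bit targets. None of this comes from the standard or from the answer above: `__int128`, `__SIZEOF_INT128__` and `__extension__` are GCC/Clang specifics, `u128` and `mul_high_u64` are made-up names, and whether `__int128` counts as an "extended integer type" in the standard's sense is a separate question that I am not trying to settle here.

```c
#include <stdint.h>

#ifdef __SIZEOF_INT128__
/* __extension__ asks GCC/Clang not to warn about the non-ISO type
   when compiling in pedantic mode. */
__extension__ typedef unsigned __int128 u128;
#endif

/* Returns the high 64 bits of the 128-bit product a * b. */
uint64_t mul_high_u64(uint64_t a, uint64_t b)
{
#ifdef __SIZEOF_INT128__
    /* Implementation keyword of the reserved form does the work. */
    return (uint64_t)(((u128)a * b) >> 64);
#else
    /* Portable fallback: schoolbook multiplication on 32-bit halves. */
    uint64_t a_lo = a & 0xFFFFFFFFu, a_hi = a >> 32;
    uint64_t b_lo = b & 0xFFFFFFFFu, b_hi = b >> 32;
    uint64_t lo_lo = a_lo * b_lo;
    uint64_t hi_lo = a_hi * b_lo;
    uint64_t lo_hi = a_lo * b_hi;
    uint64_t hi_hi = a_hi * b_hi;
    uint64_t cross = (lo_lo >> 32) + (hi_lo & 0xFFFFFFFFu) + lo_hi;
    return hi_hi + (hi_lo >> 32) + (cross >> 32);
#endif
}
```

Because the keyword starts with a double underscore, no strictly conforming program can already be using `__int128` as its own identifier, so adding the keyword cannot change the meaning of such a program.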
2
Sneftel On

While the Standard is primarily written to guide implementers, it is written as a description of what makes a program well-formed, and what its effect is. That's because the basic definition of a standards-conforming compiler is one that does the correct thing for any standards-conforming program:

A strictly conforming program shall use only those features of the language and library specified in this International Standard....A conforming hosted implementation shall accept any strictly conforming program.

Read separately, this is hugely restrictive of extensions to a compiler. For instance, based solely on that clause, a compiler shouldn't get to define any of its own reserved words. After all, any given word a particular compiler might want to reserve, could nevertheless show up in a strictly conforming program, forcing the compiler's hand.

The standard goes on, however:

A conforming implementation may have extensions (including additional library functions), provided they do not alter the behavior of any strictly conforming program.

That's the key piece. Compiler extensions need to be written in such a way that they affect nonconforming programs (ones which contain undefined behavior, or which shouldn't even compile at all), allowing them to compile and do fun extra things.

So the purpose of defining "reserved identifiers", when the language doesn't actually need those identifiers for anything, is to give implementations some extra wiggle room by providing them with some things which make a program nonconforming. The reason a compiler can recognize, say, __declspec as part of a declaration is because putting __declspec into a declaration is otherwise illegal, so the compiler is allowed to do whatever it wants!
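A common way to lean on this in portable code is to hide the vendor keyword behind an ordinary macro. The sketch below is only an illustration: `NOINLINE` and `add_one` are hypothetical names, and the `noinline` spellings plus the `_MSC_VER` / `__GNUC__` detection macros are MSVC and GCC/Clang conventions that are not part of the quoted standard text.

```c
/* Choose the vendor spelling of "do not inline this function".
   Every implementation-specific token has a reserved form, so none
   of them can clash with names in a strictly conforming program. */
#if defined(_MSC_VER)
#  define NOINLINE __declspec(noinline)
#elif defined(__GNUC__)
#  define NOINLINE __attribute__((noinline))
#else
#  define NOINLINE /* unknown compiler: expand to nothing */
#endif

/* An ordinary function carrying the extension-based annotation. */
NOINLINE int add_one(int x)
{
    return x + 1;
}
```

The macro itself has a non-reserved name, so it belongs to the program, while everything it expands to sits in the implementation's name space.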

The importance of "reserved for any use", therefore, is that it leaves no question about a compiler's power to treat such identifiers as having any meaning it cares to. Future compatibility is a comparatively distant concern.

The C++ standard works in a similar way, though it's a bit more explicit about the gambit:

A conforming implementation may have extensions (including additional library functions), provided they do not alter the behavior of any well-formed program. Implementations are required to diagnose programs that use such extensions that are ill-formed according to this International Standard. Having done so, however, they can compile and execute such programs.

I suspect the difference in wording is down to the C++ standard just being clearer about how extensions are meant to work. Nevertheless, nothing in the C standard precludes an implementation from doing the same thing. (And we all basically ignore the requirement that the compiler warn you every time you use __declspec.)

0
thb On

This is months late, but one point remains that the others have not addressed.

Your question can be viewed from the opposite direction. The standard allows the implementation (as you have observed) to use a symbol like _Foo but, more importantly, thereby forbids the implementation from using foo. The latter is reserved for your use.

To understand, for discussion's sake, suppose that a future C standard introduced the new keyword _Foo. The hypothetical implementation was already using this symbol, so what happens?

Answer:

  1. At first, the implementation will not yet have implemented the new standard. Until implemented, the new standard lacks practical effect.

  2. Later, as part of implementing the new standard, the implementation quietly changes each _Foo to _Bar.

No problem.

In fact, if you think about it in this manner, you can say that the way the standard reserves such words is almost the only way it could reserve them.