How is code stored and executed on the C++ abstract machine?


In the first book I read about C++, it went a little bit into the details of how code is actually executed on a machine (it mentioned the program counter, the call stack, return addresses, and such). I found it really fascinating to get to know how this stuff works, although I'm aware that it isn't really necessary to know how the computer works to write good code.

When reading up on the same subjects on this Q/A site, I found out that it by no means has to work the way I had learned, because what I had read about was only one particular implementation of C++, tied to a certain computer architecture and a certain compiler. C++ code could just as well run on something completely different, as long as there is a compliant compiler that behaves the "right" way. What the right way is, is then defined by the standard in terms of the behavior of an "abstract machine" (I hope I got it right so far).

Of course, I'd still like to know whether concepts like the code-segment of memory or the program counter are still "somehow" pictured in the standard, and if they are, to what extent are they pictured? How is the concept of code-pieces being executed one after another described in the abstract machine?

Since it was asked in a comment whether I'd like to have the standard repeated to me: I wasn't able to understand the standard well enough to pin down exactly what it says about the abstract machine, or which of its statements can be read as statements about an abstract notion of "program counter", "code storage", and so on. So yes, out of inability, I ask the community to interpret what's written in the standard. The expected outcome of this interpretation is the most detailed picture of the internal structure of the abstract machine that still qualifies as "abstract".


There are 2 answers

Chris Dodd On

Short answer: it's not.

We don't actually execute code on the abstract machine of the C++ spec (or on any abstract machine; other languages define them too). We execute code on real machines implemented with transistors, or in software running on transistors. The abstract machine in the language spec is used to set bounds on what the code on the real machine may do: it must run "as if" it were running on the abstract machine, at least as far as the behavior visible to the abstract machine's environment is concerned.

The relevant quote from the standard is:

A conforming implementation executing a well-formed program shall produce the same observable behavior as one of the possible executions of the corresponding instance of the abstract machine with the same program and the same input.

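The standard defines "observable behavior" only narrowly, however: roughly, accesses to volatile objects, the data written to files by the time the program terminates, and the input/output dynamics of interactive devices. Everything else the implementation is free to rearrange.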
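To make the "as if" rule concrete, here is a minimal sketch. The function names are made up for illustration, and it assumes `n` is small enough that `n * (n + 1)` does not wrap around. On the abstract machine, `sum_loop` performs one addition per iteration. An implementation is permitted to emit code equivalent to `sum_closed_form` instead, because no observable behavior changes; whether a particular compiler actually does so is not something the standard promises.

```cpp
#include <cassert>
#include <cstdint>

// As written: the abstract machine executes n additions, one after another.
std::uint64_t sum_loop(std::uint64_t n) {
    std::uint64_t total = 0;
    for (std::uint64_t i = 1; i <= n; ++i) {
        total += i;
    }
    return total;
}

// What an optimizer may generate instead under the "as if" rule
// (assumption: n * (n + 1) fits in 64 bits, so both versions agree).
std::uint64_t sum_closed_form(std::uint64_t n) {
    return n * (n + 1) / 2;
}

// Self-check: a caller cannot tell the two versions apart by their results.
void check_equivalence(std::uint64_t limit) {
    for (std::uint64_t n = 0; n <= limit; ++n) {
        assert(sum_loop(n) == sum_closed_form(n));
    }
}
```

Calling `check_equivalence(1000)` from `main` passes silently. The only difference between the two functions is how long they take, and execution time is not part of observable behavior.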

So why even define these abstract machines? Well, mostly because there are many different real machines and you want to say that your code will run the same way on any of them. Real machines are also very complex and hard to reason about. So the language spec defines an abstract machine that is a simplification of the kinds of real machines it expects to run on. The details of how code is stored and executed, in particular, are mostly "abstracted away" in the abstract machine: the spec doesn't specify them, so an implementation can use whatever mechanisms the real target provides and still be compliant.

HolyBlackCat On

The standard doesn't specify how the abstract machine works internally; that's the whole point. The concept is used to abstract away the inner workings of physical machines.

code-segment of memory or the program counter are still "somehow" pictured in the standard

No. The standard just says (roughly speaking) that statements are executed sequentially, explains the evaluation order, etc. It doesn't have a notion of processor instructions or a program counter. Function pointers are described as completely opaque, pointing to "functions" rather than to individual instructions. It doesn't even guarantee that functions are stored in the same memory as the data.
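A small sketch of what "opaque" means in practice. The names `add`, `mul`, `BinaryOp`, `apply`, and `same_function` are invented for this example. A function pointer can be copied, called, and compared for equality, and that is essentially all the standard guarantees.

```cpp
#include <cassert>

int add(int a, int b) { return a + b; }
int mul(int a, int b) { return a * b; }

// A pointer to "a function taking two ints and returning int".
using BinaryOp = int (*)(int, int);

// Calling through the pointer is allowed; the standard says nothing
// about what the pointer value looks like or where the function lives.
int apply(BinaryOp op, int a, int b) {
    assert(op != nullptr);
    return op(a, b);
}

// Equality comparison is allowed: equal if both designate the same function.
bool same_function(BinaryOp lhs, BinaryOp rhs) {
    return lhs == rhs;
}

// Not allowed in standard C++ (shown as comments so the block compiles):
//   BinaryOp next = add + 1;   // ill-formed: no arithmetic on function pointers
//   void* raw = add;           // ill-formed: no implicit conversion to void*
// A reinterpret_cast between function and object pointers is only
// conditionally supported, so an implementation may reject it.
```

For instance, `apply(add, 2, 3)` yields 5 and `same_function(add, mul)` is false. Nothing here lets a program step "one instruction further" into a function, which is why the standard needs no program counter to describe it.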

The standard also doesn't introduce the concepts of the stack and the heap. It only describes the lifetimes of objects created in different ways. Pointers are carefully described so that they are not required to be plain numeric addresses. There's no notion of registers, caches, and so on.
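As an illustration of "lifetime instead of stack and heap", here is a minimal sketch. `Tracked` and `lifetime_demo` are hypothetical names used only for this example. The standard speaks of automatic and dynamic storage duration and says when each object's lifetime ends, not where the object is placed.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical helper: records the start and end of its own lifetime in a log.
struct Tracked {
    std::vector<std::string>& log;
    std::string name;

    Tracked(std::vector<std::string>& l, std::string n)
        : log(l), name(std::move(n)) {
        log.push_back("begin " + name);
    }
    ~Tracked() { log.push_back("end " + name); }
};

std::vector<std::string> lifetime_demo() {
    std::vector<std::string> log;

    // Dynamic storage duration: lives until it is explicitly deleted.
    auto dynamic = std::make_unique<Tracked>(log, "dynamic");
    {
        // Automatic storage duration: lives until the end of this block.
        Tracked automatic(log, "automatic");
    }
    // The automatic object is already gone; the dynamic one is not.
    assert(log.back() == "end automatic");

    dynamic.reset();  // the dynamic object's lifetime ends here
    return log;
}
```

The returned log reads "begin dynamic", "begin automatic", "end automatic", "end dynamic". An implementation typically uses a stack and a heap to get this behavior, but any mechanism that produces the same lifetimes would conform.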