What is "formal semantics"?

6k views Asked by At

I'm reading a very silly paper and it keeps on talking about how Giotto defines a "formal semantics".

Giotto has a formal semantics that specifies the meaning of mode switches, of intertask communication, and of communication with the program environment.

I'm on the edge of, but just cannot quite grasp what it means by "formal semantics."

4

There are 4 answers

3
Michael Madsen On BEST ANSWER

Formal semantics describe semantics in - well, a formal way - using notation which expresses the meaning of things in an unambiguous way.

It is the opposite of informal semantics, which is essentially just describing everything in plain English. This may be easier to read and understand, but it creates the potential for misinterpretation, which could lead to bugs because someone didn't read a paragraph the way you intended them to read it.

A programming language can have both formal and informal semantics - the informal semantics would then serve as a "plain-text" explanation of the formal semantics, and the formal semantics would be the place to look if you're not sure what the informal explanation really means.

2
Michael Williamson On

To expand on Michael Madsen's answer a little, an example might be the behaviour of the ++ operator. Informally, we describe the operator using plain English. For instance:

If x is a variable of type int, ++x causes x to be incremented by one.

(I'm assuming no integer overflows, and that ++x doesn't return anything)

In a formal semantics (and I'm going to use operational semantics), we'd have a bit of work to do. Firstly, we need to define a notion of types. In this case, I'm going to assume that all variables are of type int. In this simple language, the current state of the program can be described by a store, which is a mapping from variables to values. For instance, at some point in the program, x might be equal to 42, while y is equal to -5351. The store can be used as a function -- so, for instance, if the store s has the variable x with the value 42, then s(x) = 42.

Also included in the current state of the program is the remaining statements of the program we have to execute. We can bundle this up as <C, s>, where C is the remaining program, and s is the store.

So, if we have the state <++x, {x -> 42, y -> -5351}>, this is informally a state where the only remaining command to execute is ++x, the variable x has value 42, and the variable y has value -5351.

We can then define transitions from one state of the program to another -- we describe what happens when we take the next step in the program. So, for ++, we could define the following semantics:

<++x, s> --> <skip, s{x -> (s(x) + 1)>

Somewhat informally, by executing ++x, the next command is skip, which has no effect, and the variables in the store are unchanged, except for x, which now has the value that it originally had plus one. There's still some work to be done, such as defining the notation I used for updating the store (which I've not done to stop this answer getting even longer!). So, a specific instance of the general rule might be:

<++x, {x -> 42, y -> -5351}> --> <skip, {x -> 43, y -> -5351}>

Hopefully that gives you the idea. Note that this is just one example of formal semantics -- along with operational semantics, there's axiomatic semantics (which often uses Hoare logic) and denotational semantics, and probably plenty more that I'm not familiar with.

As I mentioned in a comment to another answer, an advantage of formal semantics is that you can use them to prove certain properties of your program, for instance that it terminates. As well as showing your program doesn't exhibit bad behaviour (such as non-termination), you can also prove that your program behaves as required by proving your program matches a given specification. Having said that, I've never found the idea of specifying and verifying a program all that convincing, since I've found the specification usually just being the program rewritten in logic, and so the specification is just as likely to be buggy.

0
datageist On

Just like the syntax of a language can be described by a formal grammar (e.g. BNF), it's possible to use different kinds of formalisms to map that syntax to mathematical objects (i.e. the meaning of the syntax).

This page from A Practical Introduction to Denotational Semantics is a nice introduction to how [denotational] semantics relates to syntax. The beginning of the book also gives a brief history of other, non-denotational approaches to formal semantics (although the wikipedia link Michael gave goes into even more detail, and is probably more up-to-date).

From the author's site:

Models for semantics have not caught-on to the same extent that BNF and its descendants have in syntax. This may be because semantics does seem to be just plain harder than syntax. The most successful system is denotational semantics which describes all the features found in imperative programming languages and has a sound mathematical basis. (There is still active research in type systems and parallel programming.) Many denotational definitions can be executed as interpreters or translated into "compilers" but this has not yet led to generators of efficient compilers which may be another reason that denotational semantics is less popular than BNF.

0
nondeterministic On

What is meant in the context of a programming language like Giotto is, that a language with formal semantics, has a mathematically rigorous interpretation of its individual language constructs.

Most programming languages today are not rigorously defined. They may adhere to standard documents that are fairly detailed, but it's ultimately the compiler's responsibility to emit code that somehow adheres to those standard documents.

A formally specified language on the other hand is normally used when it is necessary to do reasoning about program code using, e.g., model checking or theorem proving. Languages that lend themselves to these techniques tend to be functional ones, such as ML or Haskell, since these are defined using mathematical functions and transformations between them; that is, the foundations of mathematics.

C or C++ on the other hand are informally defined by technical descriptions, although there exist academic papers which formalise aspects of these languages (e.g., Michael Norrish: A formal semantics for C++, https://publications.csiro.au/rpr/pub?pid=nicta:1203), but that often do not find their way into the official standards (possibly due to a lack of practicality, esp. difficulty to maintain).