How do compilers detect usage of unassigned local variables?


Compilers for languages like Java or C# will complain if you attempt to use a local variable which has not (definitely) been assigned/initialized yet.

I wonder how that functionality is implemented in the compiler. Obviously, the information about the initialization status of a variable can be saved as a boolean flag and be set accordingly after an assignment statement has been detected. But what about (nested) sub-scopes like loop bodies or conditional statements?


There are 2 answers

2
Erich Kitzmueller On

This is relatively easy. Every possible execution path must reach an assignment before any use of the variable. Loops are treated as possible paths, too: the body may run zero times, so an assignment inside it does not count after the loop, and the repetition itself doesn't matter for this kind of analysis.
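To make the path rule concrete, here is a minimal sketch in C. It assumes a made-up mini-AST (`Node`, `Kind`, and `analyze` are illustrative names, not taken from any real compiler), tracks at most 32 variables as bits, ignores the condition expressions themselves, and does not model break or continue.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical mini-AST, for illustration only. */
typedef enum { ASSIGN, USE, SEQ, IF, WHILE } Kind;

typedef struct Node {
    Kind kind;
    int var;                /* ASSIGN/USE: variable index, 0..31 */
    const struct Node *a;   /* SEQ: first; IF: then-branch; WHILE: body */
    const struct Node *b;   /* SEQ: second; IF: else-branch (may be NULL) */
} Node;

/* Returns the set of definitely assigned variables after `n`, given the
   set `in` that holds before it. Every use of a variable that is not in
   the current set increments *errors. */
unsigned analyze(const Node *n, unsigned in, int *errors) {
    if (n == NULL) return in;              /* missing else-branch: nothing assigned */
    switch (n->kind) {
    case ASSIGN:
        assert(n->var >= 0 && n->var < 32);
        return in | (1u << n->var);
    case USE:
        assert(n->var >= 0 && n->var < 32);
        if ((in & (1u << n->var)) == 0u) ++*errors;
        return in;
    case SEQ:
        return analyze(n->b, analyze(n->a, in, errors), errors);
    case IF: {
        unsigned t = analyze(n->a, in, errors);
        unsigned e = analyze(n->b, in, errors);
        return t & e;                      /* assigned only if both paths assign */
    }
    case WHILE:
        analyze(n->a, in, errors);         /* check uses inside the body */
        return in;                         /* body may run zero times */
    }
    return in;
}
```

With this sketch, `if (c) x = 1; use(x);` yields one error, while assigning x in both branches yields none. Nested scopes need no special handling because the set is simply threaded through the recursion.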

1
hello_hell On

This can be achieved by computing liveness information.

Compilers generally translate the source code into a lower-level intermediate representation (IR), divide that code into basic blocks (straight-line code with no jumps), and from there build a control-flow graph (CFG).

The liveness analysis computes a LiveOut set for each basic block. If a variable is in the LiveOut set of some basic block, it may be used in a subsequent block, along at least one path, without first being killed (assigned to).

The CFG has two special nodes: an ENTRY node and an EXIT node. If a variable is in the ENTRY node's LiveOut set, then some path uses the variable before any value has been assigned to it.
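The ENTRY check can be sketched with the usual backward data-flow equations, LiveOut(b) = union of LiveIn over the successors of b, and LiveIn(b) = use(b) ∪ (LiveOut(b) − def(b)). The `Block` layout, the bitmask encoding of variables, and the convention that block 0 is the empty ENTRY node are assumptions made for this example only.

```c
#include <assert.h>
#include <stdbool.h>

#define MAX_BLOCKS 16
#define MAX_SUCCS 2

/* Hypothetical basic-block summary; bit i of a mask stands for variable i. */
typedef struct {
    unsigned use;            /* variables read before any write in the block */
    unsigned def;            /* variables written in the block */
    int succ[MAX_SUCCS];     /* indices of successor blocks */
    int n_succ;
} Block;

/* Iterates to a fixed point, fills live_in/live_out (each of length n),
   and returns LiveOut of block 0, which is assumed to be the ENTRY node. */
unsigned live_at_entry(const Block *cfg, int n,
                       unsigned *live_in, unsigned *live_out) {
    assert(n > 0 && n <= MAX_BLOCKS);
    for (int i = 0; i < n; i++) live_in[i] = live_out[i] = 0u;

    bool changed = true;
    while (changed) {
        changed = false;
        for (int i = n - 1; i >= 0; i--) {   /* backward order converges faster */
            unsigned out = 0u;
            for (int s = 0; s < cfg[i].n_succ; s++)
                out |= live_in[cfg[i].succ[s]];
            unsigned in = cfg[i].use | (out & ~cfg[i].def);
            if (in != live_in[i] || out != live_out[i]) {
                live_in[i] = in;
                live_out[i] = out;
                changed = true;
            }
        }
    }
    return live_out[0];   /* non-zero bits: possibly used before assignment */
}
```

Any bit set in the result names a variable for which the compiler would report a possible use before assignment.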

Pointers can complicate this analysis. For example, consider the following code:

 int *p, x, y;
 ...
 *p = 123;
 y = x*2;

To avoid reporting false positives, the compiler must perform what is called pointer analysis. This analysis computes, for each pointer, the set of possible targets that the pointer may (or must) point to. In the above example, if the compiler discovers that p points to x, then x is not uninitialized when it is used in the following line.
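One way to picture how the points-to result feeds back into the check is the sketch below. It assumes the pointer analysis hands back a bitmask of possible targets for p, and it treats a single-target set as "must point to", which additionally assumes p is valid and non-null at the store. All names (`VAR_X`, `assigned_by_store`, `maybe_unassigned`) are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

enum { VAR_X = 0, VAR_Y = 1 };   /* hypothetical variable indices */

unsigned bit(unsigned var) {
    assert(var < 32u);
    return 1u << var;
}

/* Variables definitely assigned by `*p = value`, given the set of
   variables p may point to. Only a single possible target counts. */
unsigned assigned_by_store(unsigned may_point_to) {
    bool single_target = may_point_to != 0u &&
                         (may_point_to & (may_point_to - 1u)) == 0u;
    return single_target ? may_point_to : 0u;
}

/* True if reading `var` right after the store could see an unassigned value. */
bool maybe_unassigned(unsigned assigned_before, unsigned may_point_to,
                      unsigned var) {
    unsigned after = assigned_before | assigned_by_store(may_point_to);
    return (after & bit(var)) == 0u;
}
```

If p can only point to x, the read of x after the store is not flagged. If p may point to either x or y, the store proves nothing about x and the warning stays.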