I am working on a hobby retargetable C compiler in OCaml and I'm building it bottom up. So far I have an annotated AST type, abridged:
type 'e expr =
| Int of 'e * int
| Var of 'e * var
| Neg of 'e * 'e expr
| Add of 'e * 'e expr * 'e expr
| Sub of 'e * 'e expr * 'e expr
and a three-address code type (again abridged):
type node = Copy of location * location
| Unary of location * unary_op * location
| Binary of location * location * binary_op * location
and location = Temp of int | Int of int | Var of string
and unary_op = Neg
and binary_op = Add | Sub
I have functions written that will transform an AST into a list of TAC nodes ignoring the annotations. Regarding this I have two questions:
What do I do differently when transforming a type-annotated AST to a list of TAC nodes? Should I add annotations to TAC nodes too? This would allow me to later transform high level
int
/char
types into lower-level ones likeI16
/I8
.How do I handle scoping? What if I have two
Var
s that have the same name in different scopes?
How you pass annotations to your TAC is a very open question, but I'd agree that you probably want to do so.
One approach to scoping is name erasure; as you resolve scopes, you replace each unique identifier with a unique "name" (or directly with a reference to the symbol table entry.) (This is sometimes called gensymming, after the traditional Lisp
gensym
function.) More formally, it is α-reduction, a term taken from the λ calculus. This will work for languages, like C, in which names are not available to the runtime.Languages in which runtime introspection has access to names (Python, Javascript) make this process somewhat more complicated, but you can still associate each use of a name with a specific scope. In languages where scopes can be dynamic (Perl, Lisp), you would have to introduce a name resolution operation into your TAC.