Passive closure analysis
Last updated 30 April 1996
Read for next time: The third and fourth papers in the readings.
Readings assigned so far: Descriptions of the following compilers:
Gambit
Twobit
Fortran H
PL.8
Background reading: Fischer and LeBlanc chapters 15 and 16.
----------------------------------------------------------------
Passive closure analysis.
Most optimizations consist of some static analysis followed by
one or more code transformations based on that analysis.
Assignment #1 is a passive closure analysis, which marks all of
the escaping function expressions in a program. The analysis
also collects three lists:
all function expressions
all declarations whose right hand side is a function expression
all bound variables
The last of these lists was not required by Assignment #1, but
will be useful for Assignment #2.
----------------------------------------------------------------
Advice for assignment #1.
Compute a list of all variables that are bound in the program,
whether as a formal parameter of a procedure or as a declaration.
This is easy to do by modifying the code that traverses a formal F.
Alternatively you may wish to compute a list of all variables that
are mentioned in the program.
I used the following trick to simplify the recursive traversal:
When I encountered a declaration whose right hand side was a
function expression, I marked the function expression with a -1.
Whenever I traversed a function expression, I incremented its
mark. At the end of the recursive traversal, the function
expressions that escape are marked 1 and those that don't escape
are marked 0.
I used a similar trick for symbols that represent variables. At
the end of the recursive traversal, the variables that escape are
marked with some positive number and those that don't escape are
marked 0.
It will be convenient to add other kinds of marks later on, so it
will save space if each kind of mark uses the smallest number of
bits possible. If you use the trick above, then after the
recursive traversal you can go over the list of variables and
change each positive mark to a 1.
Be sure to implement marks in a representation-independent way,
or your compiler will quickly become difficult to maintain as you
add new kinds of marks.
----------------------------------------------------------------
Using passive closure analysis to generate more efficient code.
There are many ways to improve the generated code based on the
closure analysis performed in Assignment #1.
One of the most straightforward ways is to avoid allocating a
closure for function expressions that do not escape, and to
allocate variables on the stack unless they occur free in some
function expression that escapes.
So we first need to extend the closure analysis of Assignment #1
by marking every variable that occurs free within a function
expression that escapes. Note that this mark is distinct from
the mark used in Assignment #1.
We can then change the code that is generated for
* blocks
* declarations
* variables
* function expressions
* procedure calls
* procedure returns
We will be allocating some ribs in the heap, and others on
the stack. Since the offsets for a heap-allocated rib are
different from the offsets for a stack-allocated rib,
lexical addresses and compile-time environments will have
to become more general.
----------------------------------------------------------------
Lexical addresses.
An optimizing compiler typically uses several different forms of
lexical address, because a variable may be allocated in a register,
on the stack, or in the heap, and the path from a register to the
variable's home may traverse both stack-allocated and heap-allocated
ribs. A variable whose value is a known procedure may not be
allocated at all, but the lexical address for that variable must
contain the procedure's entry point and tell how to compute the
environment in which the procedure was declared.
A fairly general form of lexical address consists of a register,
a possibly empty path, with an optional label. The path describes
the sequence of offsets that must be used to follow the static
chain, and also indicates whether each rib has been allocated
on the stack or the heap.
LexAddr ::= <Register, [Path]>
| <Register, [Path], Label>
Path ::= <empty>
| Tag Int, Path
Tag ::= S | H
For example, if the lexical address of x is <$6, []>, then the
L-value of x can be fetched into $8 by
ori $8,$6,0
If the lexical address of y is
<$fp, [S 0 S 0 H 1 S 12]>
then the L-value of y can be fetched into $4 by
ori $4,$fp,0
lw $4,0($4)
lw $4,0($4)
lw $4,1($4)
lw $4,12($4)
If the lexical address of z is
<$fp, [], L1028>
then the L-value of z is a procedure that can be called via
ori $fp,$fp,0 # redundant, obviously
j L1028
If the lexical address of z is
<$fp, [S 0 H 1], L1028>
then the L-value of z is a procedure that can be called via
ori $fp,$fp,0 # redundant, obviously
lw $fp,0($fp)
lw $fp,1($fp)
j L1028
Notation.
A[I:r] stands for the environment obtained from A by removing any
binding for I and adding the binding I:<r, []>.
A[I:r,L] stands for the environment obtained from A by removing
any binding for I and adding the binding I:<r, [], L>.
A[I:r,tag,o] stands for the environment obtained from A by removing
any binding for I and adding the binding I:<r [tag o]>.
A[+ r,tag,o] stands for the environment B defined as follows:
If the lexical address of x in A is <r, []> or <r, [], L>,
then x is not bound in B. (This should never happen.)
If the lexical address of x in A is <r, [...]> or <r, [...], L>,
then the lexical address of x in B is <r, [tag o ...]> or
<r, [tag o ...], L>.
Otherwise the lexical address of x in B is the same as in A.
A[I:+,tag,o] stands for the environment obtained from A by replacing
the binding I:<r, [...]> or I:<r, [...], L> by I:<r, [tag o ...]> or
I:<r, [tag o ...], L>.
The correspondence between our old and new notations is this:
old notation new notation
------------ ------------
A[I:o] A[I:$fp,H,o]
A[+] A[+ $fp,H,1]
<m, n> <$30, [H 1 ... H n]>
---
m of these
----------------------------------------------------------------
Summary of the new data types.
The operations on a tag are:
return H.
return S.
given two tags, test them for equality.
The operations on a path are:
return the empty path.
given a tag t, an offset o, and a path [P], return [t o P].
given a path, test to see if it is the empty path.
given a nonempty path, return its first tag.
given a nonempty path, return its first offset.
The operations on a lexical address are:
given a register r and a path P, return <r, P>.
given a register r, a path P, and a label L, return <r, P, L>.
given a lexical address <r, P> or <r, P, L>, return r.
given a lexical address <r, P> or <r, P, L>, return P.
given a lexical address <r, P, L>, return L.
The operations on a compile-time environment are:
return an empty environment
return the initial environment (for the standard library)
given A, I, r, return A[I:r].
given A, I, r, L, return A[I:r,L].
given A, I, r, tag, o, return A[I:r,tag,o].
given A, r, tag, o, return A[+ r,tag,o].
given A, I, tag, o, return A[I:+,tag,o].
given an environment A and a variable I, return the lexical
address of I in A.
----------------------------------------------------------------
Some advice for Assignment #2.
Implement these more general lexical addresses and compile-time
environments as new data types. Unless your implementation
language allows overloading, be sure to use different names for
the operations on the new data types.
Implement the old compile-time environments using the new
compile-time environments.
If your compiler is representation independent, then it should
continue to work exactly as before. Test this by comparing the
output of the old compiler with the output of the new compiler.
It should match exactly.
----------------------------------------------------------------
Generating more efficient code.
In some languages we could allocate all variables on the stack.
When a function expression is evaluated, we could just copy the
values of its free variables into the closure.
We can't do this for quirk13 because of a language design bug.
The variables declared by a block are mutually recursive, but
they are initialized left to right. That means a function
expression can close over a variable that hasn't been initialized
yet, as in the following example.
{
proc () -> int make_counter_n (int n) {
if n > 0
then {
array of proc () -> int Counters =
newarray [2 * n]
function (int i) -> proc () -> int {
function () -> int {
state := state + n;
state -- closes over state
}
};
ref int state = new 0;
Counters[n]
}
else make_counter_n (1)
};
proc () -> int counter = make_counter_n (17);
counter ();
counter ();
counter ();
write_int (counter ());
}
One solution to this problem is to close over the rib that the
variable is in, not over the value of the variable. Not only
does that imply an extra load instruction to access that variable
from the body of the function expression, it also means that the
entire rib in which the free variable occurs must be allocated
on the heap.
Expressions must have a new inherited attribute E.stack giving
the number of bytes that have been pushed onto the stack for
the current procedure activation. Tail-recursive calls and
procedure returns must pop these bytes off the stack. (This
causes a problem for some tail-recursive calls, as explained
later.)
Several parts of the code generator must generate code to
follow a path P, starting from some register r0 and evaluating
into some target register r. The code for this is
ori $r,$r0,0
[path(r,P)]
where path(r,[]) is the empty sequence of code, and
path(r,[t o P']) is
lw $r,o($r)
[path(r,P')]
(block [] [E1 ... En])
Don't allocate a new rib.
(block [D1 ... Dm] [E1 ... En])
If any of the variables declared by D1 ... Dm occur free within
an escaping function expression, then generate the same code as
before.
If none of the declared variables occur free within an escaping
function expression, then allocate the new rib on the stack.
This entails the following changes, marked by a #, to our
attribute grammar for code generation:
| (block [D1 ... Dm] let G = [D1 ... Dm]
[E1 ... En]) x = 4*m + 4
in
# G.env = E.env[+ $fp,S,0];
G.reg = E.reg;
# G.offset = 4;
E1.env = ... = En.env = G.newenv;
E1.reg = ... = En.reg = E.reg;
E1.tail = ... = En-1.tail = false;
En.tail = E.tail;
# E1.stack = ... = En.stack = E.stack + x
if E.tail
then E.code =
# addi $sp,$sp,-x
# sw $fp,4($sp)
# addi $fp,$sp,4
[G.code]
[E1.code]
...
[En.code]
else E.code =
# addi $sp,$sp,-x
# sw $fp,4($sp)
# addi $fp,$sp,4
[G.code]
[E1.code]
...
[En.code]
# lw $fp,0($fp)
# addi $sp,$sp,x
(decl F E)
If E is a function expression that does not escape, then
the declaration can be handled specially as follows.
# let L1, L2 = new labels
# in
# D.newenv = D.env[F.id:$fp,L];
E.env = D.fixenv;
E.reg = D.reg;
E.tail = false;
let r = D.reg
x = D.offset
in
D.code =
# b L2
# L1:
# [as for function expressions]
# L2:
There is no need to allocate space for the variable declared
by this declaration. If space is allocated, it should be
zeroed to avoid breaking the garbage collector.
(function [...] [F1 ... Fn] T0 E)
If this function expression does not escape, then there is
no need to create a closure for it. It will be the right
hand side of some declaration, compiled as above.
Otherwise a closure is created as follows.
The formal parameters can safely be allocated on the stack.
Their values have already been computed, so any function
expression that closes over them can copy their values.
Let I1 = F1.id, ..., In = Fn.id.
The function expression cannot be closed in whatever run-time
environment is in $fp, because that environment probably
includes stack-allocated ribs.
We can create a new rib that contains a slot for each of the
function expression's free variables. If that variable is
in a heap-allocated rib, the slot will contain that rib. If
the variable is in a stack-allocated rib, the slot will
contain the variable's value.
A new compile-time environment B must be constructed that
corresponds to the run-time environment in which the function
expression will be closed.
Let {J1, ..., Jm} be the variables that occur free within
this function expression. The code that initializes slot k
of the new rib depends upon the lexical address of Jk in E.env.
Suppose the lexical address of Jk in E.env is
<$r1,[]>
where $r1 is not $fp. Then the code to initialize slot k is
sw $r1,4*k+4-ATAG($r)
and the lexical address of Jk in B is <$fp,[H 4*k+4-ATAG]>.
Suppose the lexical address of Jk in E.env is
<$fp,[... S o]>
Then the code to initialize slot k is
ori $2,$fp,0
[path(2,[... S o])]
sw $2,4*k+4-ATAG($r)
and the lexical address of Jk in B is <$fp,[H 4*k+4-ATAG]>.
Suppose the lexical address of Jk in E.env is
<$fp,[... H o]>
Then the code to initialize slot k is
ori $2,$fp,0
[path(2,[...])] # [...] instead of [... H o]
sw $2,4*k+4-ATAG($r)
and the lexical address of Jk in B is <$fp,[H (4*k+4-ATAG) H o]>.
let L1, L2 = new labels
x = 4*n + 4
# y = 4*m + 4
rn = n + 3
r = E.reg
in
# E0.env = B[+ $fp,S,0]
# [I1:$fp,S,4]
# ...
# [In:$fp,S,4*n]
E0.reg = 4;
E0.tail = true;
# E0.stack = y;
E.code =
beq $0,$0,L2
L1:
ori $3,$ra,0
# addi $sp,$sp,-x
# sw $fp,4($sp)
# addi $fp,$sp,4
# sw $4,4($fp) #
# ... # store arguments
# sw $rn,4*n($fp) #
[E0.code]
L2:
# ori $2,$0,y
# jal L_quirk13_alloc_vec
# ori $r,$2,0
# sw $0,4-ATAG($r)
# [initialize slot 1]
# ...
# [initialize slot m]
jal L_quirk13_alloc_pair
# sw $r,4-PTAG($2)
# ori $r,$2,0
# la $2,L1
# sw $2,0-PTAG($r)
# [returnif(E.tail,E.stack)]
returnif(true,n) stands for the three instructions
ori $2,$4,0
addi $sp,$sp,n
jr $3
(call E0 [E1 ... En])
If E0 is a variable whose lexical address is of the form
<$r1,[...],L>
then there is no explicit closure for the variable being
called. The environment must be computed at the call site.
If E.tail is false, then the usual
[E0.code]
lw $2,0-PTAG($r0)
lw $fp,4-PTAG($r0)
jalr $2
is replaced by
ori $fp,$r1,0
[path(30,[...])]
jal L
If E.tail is true, then there is a subtle danger to contend
with. A tail call ought to pop the stack as for a return,
but the stack can't be popped if the ribs that would be popped
contain variables that are used by the procedure being called.
If E.tail is true and E.stack is zero, then the usual
[E0.code]
lw $2,0-PTAG($r0)
lw $fp,4-PTAG($r0)
ori $ra,$3,0
jr $2
is replaced by
ori $fp,$r1,0
[path(30,[...])]
ori $ra,$3,0
j L
If E.stack is nonzero, then the easiest thing to do is to
compile the call as though E.tail were false, and to follow
the call with a return that pops the stack.
To preserve tail-recursion in some cases when E.stack is nonzero,
the compiler can keep track of how many ribs have been allocated
on the stack as well as how many bytes. If the lexical address
of E0 indicates that it does not need any of the ribs that have
been allocated on the stack by the calling procedure, then it is
safe for the calling procedure to pop the stack before the jump.
Allocating local variables in registers instead of on the stack
helps to avoid this tradeoff between stack allocation and tail
recursion. Lambda lifting also helps.
----------------------------------------------------------------
Advice on implementing compile-time environments.
Don't try to implement compile-time environments by side effect,
and don't worry about storage deallocation. Use a simple but
very general prototype implementation until you know precisely
what kind of compile-time environments are needed for your
optimizer. Representation independence guarantees that you can
then tune your implementation of compile-time environments
without breaking your optimizing compiler.
I represented a compile-time environment as a pointer to a block
of storage laid out as follows:
empty environment --> [ 0, __, __, __, __, __]
A[I:r] --> [ 2, A , I , r , __, 0 ]
A[I:r,L] --> [ 2, A , I , r , __, L ]
A[I:r,t,o] --> [ 3, A , I , r , t , o ]
A[+ r,t,o] --> [ 4, A , __, r , t , o ]
A[I:+,t,o] --> [ 5, A , I , __, t , o ]
To lookup a variable I' in an environment A', proceed by cases
on A':
If A' looks like then return
---------------- -----------
--> [ 0, __, __, __, __, __] error: I' is not bound in A'
--> [ 2, A , I , r , __, 0 ] if I' = I
then <r,[],0>
else lookup (I', A)
--> [ 2, A , I , r , __, L ] if I' = I
then <r,[],L>
else lookup (I', A)
--> [ 3, A , I , r , t , o ] if I' = I
then <r,[t o],0>
else lookup (I', A)
--> [ 4, A , __, r , t , o ] let <r',[P],L> = lookup (I', A)
in
if r = r'
then <r,[t o P],L>
else <r',[P],L>
--> [ 5, A , I , __, t , o ] if I' = I
then let <r',[P],L> = lookup (I', A)
in
<r',[t o P],L>
else lookup (I', A)
----------------------------------------------------------------