Read the Chow and Hennessy paper on priority-based coloring.
We will have a midterm 1 May. Open book, open notes.
The control flow graph for
void ip (array of ref float x, array of ref float y) {
int n = #x;
ref int i = new 0;
ref float sum = new 0.0;
while i < n do {
sum := sum + x[i] * y[i];
i := i + 1
};
sum
};
is
L0:
goto L1
L1:
n := #x;
i := new 0;
sum := new 0.0;
goto L2
L2:
goto L3
L3:
t1 := @i;
t2 := t1 < n;
if t2 then goto L4 else goto L5
L4:
t3 := @sum;
t4 := @i;
t5 := x[t4];
t6 := t3 +% t5;
t7 := @i;
t8 := y[t7];
t9 := t6 +% t8;
@sum := t9;
t10 := @i;
t11 := t10 + 1;
@i := t11;
goto L3
L5:
t12 := @sum;
return t12
Escape analysis for reference values:
i and sum are initialized
by local declarations of the form
ref T I = new E
and they do not escape: They are not
accessible from any function expressions that
escape, and are mentioned only
in load and store (@) contexts.
Hence they are not aliased to anything,
so @i and @sum can be
treated as scalar variables instead of load or
store operations. Replacing i and
sum by their R-values
v_i and v_sum, we get:
L0:
goto L1
L1:
n := #x;
v_i := 0;
v_sum := 0.0;
goto L2
L2:
goto L3
L3:
t1 := v_i;
t2 := t1 < n;
if t2 then goto L4 else goto L5
L4:
t3 := v_sum;
t4 := v_i;
t5 := x[t4];
t6 := @t5;
t7 := t3 +% t6;
t8 := v_i;
t9 := y[t8];
t10 := @t9;
t11 := t7 +% t10;
v_sum := t11;
t12 := v_i;
t13 := t12 + 1;
v_i := t13;
goto L3
L5:
t14 := v_sum;
return t14
L0 uses { } defines { }
L1 uses {x} defines {n, v_i, v_sum}
L2 uses { } defines { }
L3 uses {v_i, n} defines {t1, t2}
L4 uses {v_sum, v_i, x, y} defines {t3, t4, t5, t6, t7,
t8, t9, v_sum, t10,
t11, v_i}
L5 uses {v_sum} defines {t14}
LiveIn(i) = Uses(i) union ( LiveOut(i) - Defines(i) )
LiveOut(i) = Union { LiveIn(j) | i goesto j}
At the return point (on exit from L5), no local variables are live except t14.
LiveIn(0) = LiveOut(0)
LiveIn(1) = {x} union (LiveOut(1) - {n, v_i, v_sum})
LiveIn(2) = LiveOut(2)
LiveIn(3) = {v_i, n} union (LiveOut(3) - {t1, t2})
LiveIn(4) = {v_sum, v_i, x, y}
union (LiveOut(4) - {t3, t4, t5, t6, t7,
t8, t9, v_sum, t10,
t11, v_i})
LiveIn(5) = {v_sum} union (LiveOut(5) - {t14})
LiveOut(0) = LiveIn(1)
LiveOut(1) = LiveIn(2)
LiveOut(2) = LiveIn(3)
LiveOut(3) = LiveIn(4) union LiveIn(5)
LiveOut(4) = LiveIn(3)
LiveOut(5) = {t14}
LiveIn(0) = {x, y}
LiveIn(1) = {x, y}
LiveIn(2) = {x, y, n, v_i, v_sum}
LiveIn(3) = {x, y, n, v_i, v_sum}
LiveIn(4) = {x, y, n, v_i, v_sum}
LiveIn(5) = {v_sum}
LiveOut(0) = {x, y}
LiveOut(1) = {x, y, n, v_i, v_sum}
LiveOut(2) = {x, y, n, v_i, v_sum}
LiveOut(3) = {x, y, n, v_i, v_sum}
LiveOut(4) = {x, y, n, v_i, v_sum}
LiveOut(5) = {t14}
We can use the information provided by liveness analysis to perform global register allocation.
There isn't much else we can do with this example by way of global optimization, because our copying garbage collector requires the address arithmetic for loads and stores involving heap objects to be delayed until run time.
If not for the garbage collector, we could expose some of this addressing arithmetic to optimization:
L0:
goto L1
L1:
n := #x;
v_i := 0;
v_sum := 0.0;
goto L2
L2:
goto L3
L3:
t1 := v_i;
t2 := t1 < n;
if t2 then goto L4 else goto L5
L4:
t3 := v_sum;
t4 := v_i;
u1 := 4 * t4;
t5 := x + u1;
t6 := @t5;
t7 := t3 +% t6;
t8 := v_i;
u2 := 4 * t8;
t9 := y + u2;
t10 := @t9;
t11 := t7 +% t10;
v_sum := t11;
t12 := v_i;
t13 := t12 + 1;
v_i := t13;
goto L3
L5:
t14 := v_sum;
return t14
Local optimization on L3 and L4 yields
L3:
t2 := v_i < n;
if t2 then goto L4 else goto L5
L4:
u1 := 4 * v_i;
t5 := x + u1;
t6 := @t5;
t7 := v_sum +% t6;
t9 := y + u1;
t10 := @t9;
v_sum := t7 +% t10;
v_i := v_i + 1;
goto L3
The L3-L4 loop contains no loop-invariant computations except n, x, and y, but it does have several induction variables.
An induction variable is a variable whose value changes
by a constant amount on each iteration of the loop.
Any variable whose value is a linear function of an
induction variable is also an induction variable.
Proof: Let v be an induction variable,
and let w be a variable such that
w = c1 * v + c2
Let
v' = v + c
w' = c1 * v' + c2
Then
w' = c1 * (v + c) + c2
= c1 * v + c2 + c * c1
= w + (c * c1)
so the new value of w differs from the old
by (c * c1).
We can find induction variables by searching for variables that are assigned exactly once on every path through the loop body by an assignment whose right hand side is the sum or difference of the variable and a constant. We can then find related induction variables by searching for variables that are assigned exactly once on every path by an assignment whose right hand side is a linear function of an induction variable.
For example, v_i is an induction variable.
Therefore u1 is an induction variable, as
are t5 and t9:
v_i := v_i + 1; ; + 1
u1 := 4 * v_i; ; + 4
t5 := x + u1; ; + 4
t9 := y + u1; ; + 4
It may be advantageous to create a new induction variable
with the same value as the old that is initialized in the
loop pre-header and is updated by an assignment of the
form w := w + c.
L2:
vii := v_i;
u1i := 4 * vii;
t5i := x + u1i;
t9i := y + u1i;
goto L3
L3:
t2 := v_i < n;
if t2 then goto L4 else goto L5
L4:
u1 := 4 * v_i;
t5 := x + u1;
t6 := @t5;
t7 := v_sum +% t6;
t9 := y + u1;
t10 := @t9;
v_sum := t7 +% t10;
v_i := v_i + 1;
vii := vii + 1;
u1i := u1i + 4;
t5i := t5i + 4;
t9i := t9i + 4;
goto L3
The new induction variable can be used in place of the old. To prevent an old induction variable that is live on exit from the loop from being updated each time through the loop, it can be assigned at loop exit, using a new block for that purpose if necessary.
L2:
vii := v_i;
u1i := 4 * vii;
t5i := x + u1i;
t9i := y + u1i;
goto L3
L3:
t2 := vii < n;
if t2 then goto L4 else goto L5a
L4:
u1 := 4 * vii; ; now dead
t5 := x + u1i; ; now dead
t6 := @t5i;
t7 := v_sum +% t6;
t9 := y + u1i; ; now dead
t10 := @t9i;
v_sum := t7 +% t10;
v_i := vii + 1; ; now dead
vii := vii + 1;
u1i := u1i + 4;
t5i := t5i + 4;
t9i := t9i + 4;
goto L3
L5a:
v_i := vii; ; dead
u1 := u1i; ; dead
t5 := t5i; ; dead
t9 := t9i; ; dead
goto L5
Uses of an induction variable can be replaced by uses
of a linearly related induction variable.
The induction variables t5i and
t9i are used as addresses, so they
are not good candidates for replacement.
On the other hand, u1i isn't doing
anything useful within the loop itself, and the only
use of vii is to calculate a boolean,
so their uses can be replaced by uses of
t5i.
L2:
vii := v_i; ; now dead
u1i := 4 * vii; ; now dead
t5i := x + u1i;
t9i := y + u1i;
goto L3
L3:
t20 := 4 * n; ; t20 = 4 * n
t21 := t20 + x; ; t21 = x + 4 * n
t2 := t5i < t21; ; (t5i < t21) = (vii < n)
if t2 then goto L4 else goto L5a
L4:
u1 := 4 * vii; ; dead
t5 := x + u1i; ; dead
t6 := @t5i;
t7 := v_sum +% t6;
t9 := y + u1i; ; dead
t10 := @t9i;
v_sum := t7 +% t10;
v_i := vii + 1; ; dead
vii := vii + 1; ; now dead
t22 := t5i - x; ; now dead
u1i := t22 + 4; ; now dead
t5i := t5i + 4;
t9i := t9i + 4;
goto L3
L5a:
v_i := vii; ; dead
u1 := u1i; ; dead
t5 := t5i; ; dead
t9 := t9i; ; dead
goto L5
Reduction in strength then falls out as a byproduct of dead code elimination and hoisting of computations whose values are constant within the loop.
L2:
t5i := x + u1i;
t9i := y + u1i;
t20 := 4 * n;
t21 := t20 + x;
goto L3
L3:
t2 := t5i < t21;
if t2 then goto L4 else goto L5a
L4:
t6 := @t5i;
t7 := v_sum +% t6;
t10 := @t9i;
v_sum := t7 +% t10;
t5i := t5i + 4;
t9i := t9i + 4;
goto L3
L5a:
goto L5