Around Methods and Exception Handling in Demeter/Java

Lars T Hansen
March 13, 1997

In a term project for COM3360 I proposed some extensions to Demeter/Java that allow programmers to use Java's exception handling facilities in their Demeter/Java programs. That proposal also introduced the notion of around methods and noted that such methods are powerful, but the proposal did not actually incorporate around methods and did not use them to explain the proposed exception handling methodology.

The present project, in contrast, simplifies the previous proposal by making around methods a fundamental part of Demeter/Java and implementing the exception handling mechanisms in terms of the around methods, with a few modest extensions to the language to accomodate Java's requirement that all methods must declare the exceptions they throw. (These extensions are almost identical to those of the original proposal.)

The resulting design is simpler and probably more easily understandable than the original. Syntactically, I have done away with catcher and thrower methods (they are no longer needed) and have moved the exception annotations into the method header to be more Java-like. Semantically, the protect mechanism is explained in terms of around methods.

A prototype of the present proposal has been implemeted and is being integrated into Demeter/Java.

Contents

1. Around Methods: Introduction

The around method allows you to surround a traversal below any node with arbitrary computation (which may retain state), to do the traversal multiple times, and not to do the traversal at all. It is a generalization of the functionality found in the before and after methods.

Around methods are specified in the same fashion as other visitor methods:

  V {
    around C (@ ... code ... @)
  }
where the code in addition to the host parameter also receives a parameter called subtraversal. The latter parameter is an instance of some implementation class, and has one public method called apply().

Calling subtraversal.apply() starts the traversal below the host node. If you have a traversal-plus-visitor pair: The .cd:

  A = B.
  B = C.
The .beh:
  A { traversal t( V v ) { to C; } };

  V { before B (@ foo; @)
      after B (@ bar; @)
  }
Then you can rewrite the .beh to the following, equivalent, form:
  A { traversal t( Visitor v ) { to C; } };

  V { around B (@ 
        foo;
	subtraversal.apply();
	bar;
      @)
  }

There can now be local variables in the around method for state retained around the subtraversal; in the original code, this state would have to be in V if the before and after methods were to share it. But even shared state in V is less powerful than local state in the around bethod because the local state is local to the method, i.e., in a traversal along a graph that has multiple B's on it, each around method can have its own local state.

The original motivation for around methods was exception handling, because exception handlers on a traversal may need to be wrapped around the subtraversal starting at some class:

  V { around B (@
      try {
	subtraversal.apply();
      }
      catch (MyException E) {
	...
      }
    @)
  }
However, around methods are more generally useful than just for exception handling. They can be used to implicitly keep a stack of information (as outlined above); to control whether a subtraversal should be performed at all; and to perform a subtraversal multiple times. It is possible to control traversals in similar fashions in the current system also, by splitting traversals into multiple traversals. In fact, it seems generally possible to simulate around methods by splitting traversals at the point where the around method would be executed, but such a simulation is naturally non-local. In some sense, the around methods therefore add expressive power to the existing system.

The cost of around methods comes as a decrease in the amount of information about the program that can be determined at compile (expansion) time. Some examples follow.

Information about whether a subtraversal will be performed at all, or multiple times, is hidden inside the around method, which the Demeter/Java system does not analyze. Certainly, the subtraversals are objects that can be stored in data structures, passed to functions, and so on. It is probably possible (exercise left for the reader...) to trick the system into performing some subtraversals at times when a traversal is not underway.

2. Around Methods: Specification

There is an around method like there are now before and after methods. The around method is called after all before methods and receives an extra parameter called subtraversal. That parameter is an instance of some unspecified class, call it AC, and it has a single method, called apply:
  AC.apply : () -> void
Calling subtraversal.apply() performs the traversal rooted at the class for which the around method is specified. When the around method returns, all after methods will be called.

For example,

  around Team (@
    try {
      subtraversal.apply();
    }
    catch (ListTooLongException e) {
      ...
    }
  @)

If there are multiple around methods for the same class in a single visitor, they are nested in top-to-bottom textual order: the first around method is called first, and when it calls apply(), the next around method is entered. (Demeter/Java does not currently allow this.)

If there are multiple visitors each with an around method for a given class, calls to the around methods are nested in left-to-right order in the traversal's parameter list, with each group of around methods in a single visitor acting as a unit.

Around also works on edges.

3. Exception Annotations on Visitor Methods

The syntax of the before, after, and around methods is extended to include an optional throws list that declares the exceptions that may be thrown by the method:
  before Team
    throws ListTooLongException, MajorLeagueException
  (@ ... @)
If there is no throws specification, the method is expected to throw no exceptions. Otherwise, no other exceptions than the ones declared in the throws list may be thrown by the method. The Java compiler will check this; the purpose of the annotations is to make it possible for the Java compiler to perform its checking on the expanded code.

4. Exception handlers on traversals

An extension is made to the traversal whereby an exception handler can be attached to the traversal rather than to the visitor objects:
  Team {
    traversal collectBullpen( BullpenCollector b ) {
      bypassing -> *,disabled,* to Pitcher;
    }
    throws TeamTooLarge
    protect Team ( ListTooLongException e ) throws TeamTooLarge
    (@ throw new TeamTooLarge( ... ); @)
  }
(where the throws list and the protect clause are both optional). The meaning of the above fragment is as follows. The throws list declares the exceptions that may be thrown by the traversal function itself. The protect clause declares an exception handler for the given class on the traversal, in the following sense. A new context (visitor) class is created, and the protect clause is made an around method in that class:
  __Protect_314159 {
    around Team throws TeamTooLarge
    (@ try { 
         subtraversal.apply();
       }
       catch (ListtooLongException e ) {
         throw new TeamTooLarge( ... );
       }
    @)
  }
Then, every traversal function created for the traversal is made to take an instance of this new class as the first argument. More abstractly, the instance becomes an implicit first argument to the traversal. The protect clause can then be removed from the traversal.

(It may be desirable to distinguish between protect outermost -- what is described directly above -- and protect innermost, where the new object goes at the end of the parameter list. This is future work.)

Multiple exception handlers can be merged in a single implicit object in a straightforward way: a class with multiple handlers receives multiple catch handlers. Handlers on other classes receive their own around methods.

5. Implementation

Implementing around

This section describes the current, first implementation of around methods. The implementation is imperfect in many ways and will doubtless be improved upon. In particular, it is not entirely suitable for exception annotations, as noted.

Example

I will use the following example program in this section:

The .cd:

  Datum   : List | Atom .
  List    = "("  List(Datum) ")" .
  Atom    =  Ident.

  Program =  Datum .
  listVisitor = .
  atomVisitor = .

The .beh:

  Program 
  {
    (@ public static void main( String args[] ) throws Exception {
         Program prog = ...;
         prog.allAtoms( new atomVisitor(), new listVisitor() );
       }
    @)
  
    traversal allAtoms( atomVisitor av, listVisitor lv ) { to { Atom }; }
  }
  
  listVisitor {
    around List (@
      System.out.print( "(" );
      subtraversal.apply();
      System.out.print( ")" );
    @)
  }
  
  atomVisitor {
    before Atom (@
      System.out.print( host.get_atom() );
    @)
  }

Translation

The implementation technique strongly resembles the use of closures in functional programming. A call to an around method must pass a subtraversal object of some appropriate type; that object encapsulates not only the subtraversal computation itself but also the host object and the vistitors that are to be passed to the traversal function when the subtraversal starts. So the call to the around method for the List classlooks like this, where lv is a listVisitor instance:
  lv.around( new __AROUND_List_k_C( this, av, lv ), this )
The class __AROUND_List_k_C will be discussed below. (k is just a tag used to keep names unique.) The code generated for the around method is now straightforward:
  public void around( __AROUND_C subtraversal, List host ) {
    System.out.print( "(" );
    subtraversal.apply();
    System.out.print( ")" );
  }    
Note that it is an artifact of the current implementation that the class of a subtraversal is constant and known, and note further that this fact is not part of the specification of the around mechanism. (In fact, in a system where exception annotations have been implemented, the class of the subtraversal in an around method must correspond exactly to the class of the object passed, and there may be multiple around methods for a given host class that differ only in the types of the subtraversal objects that they take. See the section below on exception annotations.)

Every __AROUND_*_C is a subclass of the abstract class __AROUND_C:

  abstract class __AROUND_C {
    public abstract void apply();
  }
In the case of our example, the class __AROUND_List_k_C looks like this:
  class __AROUND_List_k_C extends __AROUND_C {
    List host;
    atomVisitor av;
    listVisitor lv;
    public __AROUND_List_k_C( List host, atomVisitor av, listVisitor lv )
    {
      this.host = host;
      this.av = av;
      this.lv = lv;
    }
    public void apply() { host.__AROUND_List_k( av, lv ); }
  }
As you can see, the body of the apply() method calls a generated method __AROUND_List_k()in the List class. That method implements the actual subtraversal in the context of the host object. In the case of the example, it looks like this:
  public void __AROUND_List_k( atomVisitor av, listVisitor lv ) {
    list.allAtoms_trv( av, lv );
  }
The implementation for edge visitors is very similar. Visitors for abstract classes are harder, however, and their implementation has been put off until class graph flattening has been implemented in Demeter/Java.

Cost

The cost of the implementation comes in two forms: additional clutter in terms of extra classes, and the run-time cost of the "closure creation".

Some classes used in the translation will be generated. There will be one class generated for each around method touched by some traversal, for each traversal that touches the method. In the worst case the number of new classes is the product of the number of traversals and the number of around methods in the program. The worst case is not likely to occur for any but the simplest programs, but the cost is still high. It seems likely that classes can be reused with the judicious use of typecasts: one might then need only one class for each number of visitors. I haven't though carefully about this yet, and in any event, exception annotations may not make this possible.

The run-time cost of class creation and method calls does not seem high: there is one class creation for each call to an around method, but the class is small and the constructor simple. Object allocation and collection need not be expensive, but the actual cost obviously depends on the actual run-time system.

Implementing protect

This is a draft; the protect mechanism has not actually been implemented.

As described so far, the translation of the protect into an around cannot be described directly in Demeter/Java as a source-to-source transformation. This is because of the implicit argument: there are no implicit arguments in Demeter/Java. There are three ways to fix this:
(a) Create the notion of an implicit argument.
(b) Fold the around methods into the first explicit visitor in the parameter list.
(c) Use a level of indirection.

Solution (a) is heavyhanded. Solution (b) breaks down if there are no visitors! This may seem contrived; consider, however, a "consistency check traversal" that will throw a NullPointerException if the parts are not all there as expected.

Solution (c) works like this. Rename the traversal, then introduce a method to take its place that creates the exception visitor and passes it to the traversal:

  Team {
    (@ 
    public void collectBullpen( BullpenCollector b ) {
      this._collectBullpen( new __Protect_314159(), b );
    }
    @)
    traversal _collectBullpen( __Protect_314159, BullpenCollector b ) {
      bypassing -> *,disabled,* to Pitcher;
    }
  }
where __Protect_314159 is the exception handling class described earlier. The upshot of having described a complete source-to-source translation of protect into Demeter/Java without protect is that any protect clauses can be handled by the Demeter/Java front-end and that the overall cost in complexity is low.

6. Computing annotations on generated functions

Each generated function (method) will have to be annotated with the list of exceptions that may escape from the method. These methods include before, after, and around methods but more importantly the traversal helper functions, the apply method in a subtraversal, and other methods that are not visible to the programmer. The algorithm described in this note is believed to be general enough to handle all such functions.

The method is to reduce the problem to a well-studied data flow problem, the Live Variables (LV) problem, for which correct and effective algorithms are known. A discussion of LV can be found in e.g. Zima and Chapman, Supercompilers for Parallel and Vector Computers, ACM Press, 1991. The next section summarizes that treatment. Following sections then phrase the exception annotation problem in terms of LV.

The Live Variables problem

Quoting from Zima and Chapman, p.90:
The live variables problem is the problem of determining, for each node n of a flowgraph, the set of all variables that are live at the end of n. Remember that v is live at the exit of n iff there is a path from n to n' such that there is an outward exposed use of v in n', and the path, except for its endpoints, is definition-free for v.

An "outward exposed use" is a property that is local to a basic block; a variable has an outward exposed use iff the variable is used in the block before it is assigned. Similarly, an "outward exposed definition" for a variable in a block is a definition that is not followed by other definitions of the variable in the block. Define the following functions on basic blocks n:

VAR = { v | v is a variable of the program }
DEF(n) = { v | there is an outward exposed definition of v in n }
USE(n) = { v | there is an outward exposed use of v in n }
PRE(n) = VAR - DEF(n)
Now the effect of a basic block n can be described by a function f:
f(n,X) = (X intersect PRE(n)) union USE(n)
for every X a subset of VAR; that is, given a set X of variables, a variable is live w.r.t. X if it is in X and is preserved by the block, or if it is used in the block (regardless of X).

There exists a straightforward fixed-point algorithm for LV (Algorithm 3.5.5 in the Zima and Chapman book). The algorithm's worst-case complexity on general graphs (with some restrictions that are not a problem for us) is O(N^2) where N is the number of nodes in the graph. On reducible graphs, the algorithm's complexity is better.

Phrasing exception annotations in terms of LV

The Demeter/Java system generates a set of methods for some classes for each traversal it processes. These methods call each other, inducing a call graph. Some nodes in this graph have exceptions annotations: these are the annotations from the user's before, after, and around methods. Consider the following program:
  A = <b> B <d> D .
  B = <c> C .
  C = <b> B .
  D = <c> C .

  A { 
    traversal t( Visitor v ) { to C; } 
  }

  V { around { B, D } throws Fatal
       (@ ... @)
      before C throws Bad
       (@ ... @)
  }
The Demeter/Java system will generate the methods A.t(V), A.t_trv(V), B.t_trv(V), C.t_trv(V), D.t_trv(V), V.around(B), V.around(D), and V.before(C). In addition, there will be two subtraversal classes for the around methods, each with an apply method: call these classes BST and DST, with methods BST.apply() and DST.apply(). We then get a call graph that looks like this, where "->" means "calls" (this needs to be an actual drawing):
    A.t(V)       ->   A.t_trv(V)
    A.t_trv(V)   ->   V.around(B)
                 ->   V.around(D)
    V.around(B)  ->   BST.apply()
    V.around(D)  ->   DST.apply()
    BST.apply()  ->   B.t_trv(V)
    B.t_trv(V)   ->   C.t_trv(V)
    C.t_trv(V)   ->   V.before(C)
                 ->   V.around(B)
    DST.apply()  ->   D.t_trv(V)
    D.t_trv(V)   ->   C.t_trv(V)
We can now annotate some of the methods in the call graph: the methods that have explicit throws lists are annotated with these, in an attribute called exThrows ("explicit throws"). So:
    "A.t(V)".exThrows = {}
    "V.around(B)".exThrows = { Fatal }
    "V.around(D)".exThrows = { Fatal }
    "V.before(C)".exThrows = { Bad }
The problem to be solved is to annotate each of the above methods with the exceptions that the method may throw. (Some study will convince you that V.before(C) throws Bad; that A.t_trv(V), V.around(B), and V.around(D) throw Fatal; and that all others throw both Bad and Fatal. Note also that the Java compiler will flag an error at A.t(V) because it is declared as throwing no exceptions but does not contain code or directives to catch exceptions thrown by A.t_trv(V).) I will demonstrate how this problem may be phrased in terms of LV.

Consider that Fatal is the only exception to escape from V.around(B). It follows that V.around(B) in a sense handles all other exceptions in the program. The same holds for the other annotated methods: only the exceptions in the methods exThrows attribute can escape from the method. Source methods that are unannotated in the source program have exThrows = {} and allow no exceptions to escape: they handle all exceptions that escape from their children in the call graph.

If a method M handles an exception E -- that is to say, some child N of M may throw E but it is known that M may not throw E because E is not in M.exThrows -- and there is no method between M and N that handles E, then each method P between M and N must be annotated as throwing E, because E will pass through P on its way from N to M.

It is not hard to see that this problem is isomorphic to LV. The set U of all exceptions corresponds to VAR. The set n.exThrows corresponds to USE(n), and if n.exThrows is defined for some n, then the set U-n.exThrows corresponds to DEFS(n); otherwise, DEFS(n) is empty.

The objective is to compute n.live for each n for which n.exThrows is not defined, and for each n, this live set will correspond to those exceptions that may escape from n, that is, those exceptions which n has to be declared as throwing. (Whew.)

As far as computational cost is concerned: I do not yet have a good handle on the size of a call graph in a Demeter/Java program, so it's hard to say whether the algorithm's O(N^2) worst-case complexity will be a burden on the running time of the Demeter/Java. The operations performed at each node are simple -- set operations that can use bitvectors for the set representation, with the expected bitvector length being quite short -- so even for a graph with a few hundred nodes, the time to compute the live sets should be acceptable, but this remains to be seen.


lth@ccs.neu.edu