Using Exceptions in Adaptive Programs

Updated December 12, 1996
lth@ccs.neu.edu

This page describes a project for COM 3360: Adaptive Object-Oriented Software Development, Fall 1996. The main project page for the class is here.

The project lives on in an implementation project. Read all about it.

Using Exceptions in Adaptive Programs

Lars Thomas Hansen
Northeastern University

December 12, 1996

1. Introduction

This paper aims to investigate the interaction between exceptions and Adaptive Programming in the context of Demeter/Java. Exceptions are events that occur at particular program points and that can be reacted to at other program points; they are in that sense nonlocal events. This nonlocality, coupled with the requirement of the Java language that methods be declared as throwing a particular exception if a call to the method may "return" by that exception rather than normally, conflicts in interesting ways with the Law of Demeter. The issue at hand is then: how can we write Adaptive (that is, structure-shy, decoupled) Java programs that use these non-structure-shy exceptions?

A second issue centers around how exceptions can be used adaptively, that is, how exception throwing and handling can be decoupled from other computation.

A third issue is this: does AP give us the opportunity to use exceptions in new and exciting ways, or does it perhaps make it easier to use exceptions in traditional ways?

This document introduces exceptions and their use in Java in Section 2. A discussion about the nonlocal nature of exceptions can then be found in Section 3, where the potential conflict between exceptions and the Law of Demeter is also pointed out. Section 4 then describes the purpose of this project in some detail. A simple design for how exceptions can be used with adaptive programs is presented in Section 5, and Section 6 extends this design to one that allows exception behavior as context objects. Section 7 considers some new ways of using exceptions that are made possible by the high abstraction level of Demeter programs. Section 8 discusses the possible implementation of most of the design. Section 9 contains a description of patterns of exception use that I have found by studying existing software in several languages. Finally, Section 10 relates some miscellaneous, peripheral observations that I have made while working on this project and that I have been asked to commit to a disk block.

2. Exceptions in Java

An exception is an unexpected, unusual, or erroneous event that arises in a program during execution. In Java, an exception is mapped onto an object of (a subclass of) the class Exception [really: of the class Throwable, but that's not of interest here], and special language constructs allow the program to react to the exception and inspect the Exception object, and in such a way perform actions to cope with the event. The terminology used by the language is that an exception is said to be thrown when it occurs, and it is said to be caught by the program if the program has chosen to react to the exception.

In Java, the syntax for handling an exception is the following:

   try {
     ... code that might throw an exception ...
   }
   catch (IOException e) {
     ... code that inspects e ...
   }
   ... more catch clauses ...
   finally {
     ... general cleanup code ...
   }

where the catch clauses and the finally clause are optional (although at least one of them must be present). The code in the try block is executed, and if it finishes without any exception being thrown, the code in the finally block is executed. If an exception is thrown during the execution of the try block, however, then that exception is reified as an object of a class derived from the system class Exception, and the first catch clause that matches the type of the exception class is entered. After the catch clause has finished executing, the finally clause is executed.

In addition to exceptions being thrown by system methods (e.g. as a result of I/O errors), programs can themselves throw exceptions using the throw statement:

   throw expression

where the expression must evaluate to an object of an exception class.

The language definition requires that a method be declared with a list of the exception classes that may escape from the method; that is, those exceptions that are thrown by the method itself or by methods called by the method, and which are not handled by the method. For example,

   public int my_method( int n ) 
     throws IOException, IEEEException
   {
     ...
   }

The method exception list becomes part of the interface to the method (and thereby part of the class of which the method is a part) and in fact, it is possible for a Java compiler to check statically whether all exceptions escaping from a method are in fact declared in the exception list.

A method that overrides or hides another method must have an exception list that is a (not necessarily proper) subset of the list in the method that is being overridden or hidden.

The definition of Java exceptions may be found in Gosling [1]. The mechanisms for exception handing in C++ [2] and Modula-3 [3] are similar to the mechanism used by Java. A somewhat dated survey of exception handling mechanisms (covering PL/1, Ada, MESA, and CLU) may be found in the book by Horowitz [4], which also contains a discussion about how exception handling differs from normal returns and the kinds of situations in which exception handling is natural, as well as a summary of the design space of exception handling mechanisms.

3. Exceptions and AP: The Problem

The exception behavior of a program is tied in a deep way to the structure of the program's call graph. An exception thrown at a point A in a program may be caught at a point B that is far away from A, in a different class; the only requirement is that B is a point on the current dynamic call graph to A. From this we see that knowledge of the exception exists in at least two places: at A and at B. However, in Java, knowledge of the exception must exist at every point (method) along the dynamic call graph between B and A. Furthermore, since the escape of an exception (thrown in A) from a method C between B and A is a static attribute of C, the compiler must also assume that any caller of C may end up at A, and hence that all these callers must also either include the exception thrown in A in their exception lists, or handle the exception. Exception declarations are therefore contagious.

One can see from the preceding description that changes to the interface of a method can result in large-scale changes to the source code of a program; for example, if an exception is added to the method's exception list, it will have to be added to the exception lists of all the method's callers, transitively, up to points where the new exception can be handled. Exception declarations are therefore non-local, and programs that use exceptions may be harder to change than those that don't. (In fact, Ellis and Stroustrup [2] report that the throw declarations on functions are optional in C++ for precisely this reason.)

It seems to me that such propagation of static information throughout the program is a flagrant violation of the Law of Demeter.

Some techniques have evolved to deal with this problem. In many cases, a method interface is designed carefully so that the method will throw a generic exception that carries further information about the exception. If the method implementation is changed so that it calls a method that signals an exception not previously handled by the caller, then the caller will handle that exception locally and map it onto the already-existing generic exception class. The generic exception class is already declared as part of the interface of every caller of the original method that does not handle it, so no global changes are needed. The only changes needed are to those methods that actually handle the exception: they need to be extended to handle the new subcase. The following example (in Modula-3, I have yet to find a single Java program that actually uses exceptions in an interesting way) illustrates the point:

    TRY
      frame.fullPathname := FS.GetAbsolutePathname (filename)
    EXCEPT
    | OSError.E (list) => 
        RAISE FormsVBT.Error (RdUtils.FailureText (list))
    END;

Even with the use of this and other techniques for using exceptions, the problem persists: programs that are to be changeable and undergo evolution in the way AP assumes that they will, may not be as adaptable when they are using exceptions as when they are not, since the addition or subtraction of an exception may require global program changes.

4. The Agenda

In the rest of the paper, I will examine three issues in more detail:

How can we use exceptions in our adaptive Demeter/Java programs? Our Demeter/Java programs should support the form of exception annotations on methods that Java requires. A solution to this problem requires judicious extensions to Demeter/Java in the form of syntax, semantics, and an implementation technique.
I see it as an important goal that if we are to annotate the Java programs that are created by Demeter/Java with information about which exceptions pass through which program points, then this information should be as precise as we can reasonably make it, and the design must reflect this.
A proposal for such a design is presented in Section 5.
How can we use exceptions "adaptively"? Traditionally, the exception behavior of a program has been viewed as an addition to the "core" computational behavior. There are clear advantages to such a view: the program does not become cluttered with error handling code; only the point of the exception and the point of the exception handler need know about the error. (This ignores the method annotations.) Can we take this one step further and remove even the point of the exception and the point of the exception handler from the program, for example by packaging the two program points in a context object that is passed to a traversal?
Section 6 discusses these ideas in some detail and also proposes a design for exception contexts.
Does Adaptive Programming or its succinct subgraph specifications let us use exceptions in new and exciting ways? For example, can we add exception handlers to edges of the call graph, or can we specify exception handlers for many points in a subgraph of the call graph succinctly?
Section 7 contains a discussion of these issues.

5. Using Exceptions with AP

This section proposes a design that allows exceptions to be used in a traditional manner in Demeter/Java while maintaining some precision in the declaration list of each method. Sections 6 and 7 discuss more sophisticated uses of exceptions in the context of AP.

What I mean by "traditional" exception usage is the following. Exceptions are thrown in procedures and caught in other procedures; in particular, exception-throwing code and exception-handling code is woven into computational code. Exceptions are represented as (user-defined) objects and carry data. Programming techniques of the type exemplified by Barrow's rules will be used when writing exception-handling code.

There are three aspects to the current design: before and after method annotations, exception handlers in and annotations on traversals, and a methodology for dealing with verbatim methods. Sections 5.1 through 5.3 discuss these aspects. Section 5.4 discusses edge visitors. Section 5.5. shows how new exception classes can be declared in the Class Dictionary. Section 5.6 summarizes the extensions, and section 5.7 lists some design ideas that were considered but that are not part of the proposal.

5.1. Methods annotations

By having the programmer annotate the before and after methods of visitors with exceptions that may be thrown by the method code, it will possible for Demeter/Java to annotate the traversal methods with the right exception lists. The syntax may be the following (which is chosen because it is quite Java-like):

    Team {
	traversal collectBullpen(BullpenCollector b) {
	    bypassing -> *,disabled,* to Pitcher;
	}
    }

    BullpenCollector {
	before Pitcher (@ pitchers = pitchers.append(host)); @)
        throws ListTooLongException;
    }

If pitchers.append() may throw ListTooLongException, then the caller of that method must either catch it or declare it as being thrown; in this example, it is declared as being thrown.

5.2. Catching exceptions in traversals

Assume that the traversal does not want to let the ListTooLongException escape, but wants to raise the a TeamTooLarge exception instead. Then the exception must be handled in the traversal function and converted. We provide a notation for annotating a traversal with a handler using the protect clause:

    Team {
	traversal collectBullpen(BullpenCollector b) {
	    bypassing -> *,disabled,* to Pitcher;
	} 
        protect Team (ListTooLongException e)
            (@ throw new TeamTooLarge( "collectBullpen" + 
                                       b.collectorName() ); 
             @)
        throws TeamTooLarge;
    }

    BullpenCollector {
	before Pitcher (@ pitchers = pitchers.append(host)); @)
        throws ListTooLongException;
    }

Here, the protect clause is an attribute on the traversal; the class name used could be any class name along the path from Team to Pitcher. The meaning is that if the ListTooLongException is thrown along any path of the traversal below Team, then that exception must be caught in the Team class, and the handler code to be executed is as is given.

There can be multiple protect clauses on a traversal.

A traversal may throw more exceptions than are thrown by the visitors' before and after methods, as is the case in this example, and that's what the throws clause on the traversal is for: it is a list of all the exceptions that may be thrown by the exception handler. (Hence, one throws goes with one protect, although it is optional. Possibly, the throws clause should go before the body, not after, to be even more Java-like.)

Notice the use of the visitor "b" in the code block of the protect clause. All traversal contexts are also parameters to the exception handler code and it is an error if the name to which the exception object is bound during handling conflicts with any context name for the traversal.

The keyword protect was chosen because the protect clause behaves differently from the Java catch clause; catch was otherwise the obvious candidate. I'm not dogmatic about this.

5.3. Verbatim methods

Currently, verbatim methods (that is, not before or after methods) in Demeter/Java are somewhat unstructured, being present simply as opaque code blocks between (@ and @). Even though the exception behaviors of these methods are known (they are declared in the method headers), they are not known to Demeter/Java. However, Demeter/Java does not need to know, if we require that before and after methods that call these methods declare as escaping any exceptions thrown by the verbatim methods and not caught by the before and after methods.

5.4. Exceptions in edge visitors

At issue here is whether throwing or catching exceptions in edge visitors is any different from throwing or catching exceptions in vertex visitors (section 5). Demeter/Java does not have edge visitors, but based on the specification of the edge wrappers of Demeter/C++, I don't foresee any problems with throwing exceptions. Catching exceptions on edges, however, is an interesting issue and will be discussed in section 7.

5.5. Declaring exception objects

Since exceptions are represented as objects, it is natural to want to declare the classes for the exception objects in the Demeter class dictionary. Herein lies a minor issue in that all exception classes inherit from a predefined class, Exception, which is not in the class dictionary. The way to define classes in the class dictionary that extend existing system classes is to use the *extends* keyword:

    TeamTooLargeException = <info> MyInfo *extends* Exception .

where the additional part objects are optional, of course. (Demeter/Java already supports this functionality.)

5.6. Extensions to Demeter/Java

The following changes to the DemJava 0.3.7 grammar are necessary, where Wrapper and Traversal are existing productions that have been amended, and the others are new:

Wrapper        : Before | After 
                 *common* <hosts> HostSpec JavaCode
                 [ ThrowsList ].

ThrowsList     =  "throws" Commalist(ClassName) ";"

Traversal      = "traversal" TraversalName TraversalArgs "{" *l
	           + PathDirective ";" - *l
	         "}" [ <protect> ProtectClauses ].

ProtectClauses = List(ProtectClause) .
ProtectClause  = "protect" CatchBody [ Finally ].
CatchBody      = ClassGlobSpec "(" ClassName ExceptionName ")"
                 JavaCode 
                 [ ThrowsList ].
Finally        = "finally" CatchBody .

ExceptionName = <name> DemIdent.

The meaning of these constructs has already been specified.

5.7. Alternatives

The design alternatives in this section were considered but not made part of the current proposal. They are thereby not lesser, only different.

Like there are before and after methods in visitors, there could be an around method. The around method would receive an object in a parameter named thunk that contains a method apply() that takes no arguments. The around method would perform some actions, then call the apply() method, then perform some cleanup. The apply() method implements the subtraversal on the object. This functionality is not available with before and after methods; for example, the around method can set up an exception handler around the subtraversal.
Arguably, this is a more general functionality than is currently provided by the protect mechanism on the traversal, and I think it's worthwhile to consider an around mechanism in the future.
For example,
```
    BullpenCollector {
      around Team 
      // parameters: (SomeThunkType thunk, Team host) 
      (@ try { 
           thunk.apply(); 
         } catch (ListTooLongException e) {
           throw new TeamTooLarge( ... );
         }
      @)
      throws TeamTooLarge;
    }
```
At this time, it is unclear how the Thunk class would be declared and how the exception annotations on the apply() method would be generated, although I believe it can be done. There can be many Thunk types, however, since the program has "no" need of manipulating them, and that may save us.
Move the protect clause into the visitor. This would not really accomplish anything except move it out of the traversal; there are advantages and disadvantages to this. For example, different visitors could have different exception handling behavior but be used with the same traversal. On the other hand, with the protect in the traversal, it only needs to be specified once for many different visitors.
The around methods are more general than this.

6. Making Exceptions Adaptive

The design presented in section 5 allows a programmer to write adaptive programs that use exceptions in a conventional manner. This section will discuss how exceptions can be made to behave more adaptively, that is, how exception behavior can be decoupled from the rest of the computational behavior.

The basic idea is that exception behavior (throwing and catching) can be packaged as a context object and passed as a parameter to a traversal.

6.1. The Idea

The assumption on which this discussion is based is that computations have some sort of core behavior that does not have "exceptional" characteristica, i.e., there is a defined computation that will forge ahead and that will either finish normally or be stopped by a "concurrent" computation that discovers that the main computation is in error. This kind of behavior is exemplified by a program where a traversal that computes a total based on fields in an object structure becomes in error and should signal an exception if the total turns negative. This can be implemented using two visitors where one has access to the internals of the other, and will fetch and check the total after each step and throw an exception if appropriate.

The above assumption is under no circumstances uncontroversial. In the given example, I suspect that the exception throwing is in fact part of the core behavior, and I further suspect that any time an exception is caused not by program errors or by external causes, it will be part of the core behavior. Therefore, a separate exception context may not be a natural or comprehensible programming technique. Furthermore, it seems like an unnatural or pointless solution for when exceptions originate from external sources.

The reason why I will still discuss (entertain? :-) the idea is that it does lead to an interesting decoupling of exception behavior (both throwing and catching) from other computational behavior; this may lead to a higher degree of reuse and a lower amount of duplicated code but is in any case a novel view of exceptions, as far as I know.

6.2. The Design

The example of the previous section can be written as follows, using the design of section 5:

    Company {
	traversal computeSalary( SummingVisitor s ) 
          { to Salary; }
        protect (TotalException e) 
          (@ s.total = -INFINITY; @)
    }

    SummingVisitor {
	before Salary 
         (@ total += host.get_salary();
            if (total < 0) throw new TotalException();
          @)
        throws TotalException;
    }

Notice how the exception code and the computational code are woven together. It might be useful to parameterize the traversal with an additional context object that carries the exception code. The following has been suggested (although with a different syntax) by Karl Lieberherr:

    Company {
	traversal computeSalary( SummingVisitor s, 
                                 ExceptionContext e )
          { to Salary; }
    }

    SummingVisitor {
      before Salary (@ total += host.get_salary(); @)
    }

    ExceptionContext {
      thrower Salary
        (@ if (s.total < 0) throw new TotalException(); @)
      throws TotalException;
      catcher Company 
        catch (TotalException e) 
         (@ s.total = -INFINITY; @)
    }

(The preceding code assumes that the SummingVisitor s is stored in the ExceptionContext in the usual way.) The following extensions to the Demeter/Java grammar define the syntax of the full construct. The production Wrapper has been extended w.r.t. the definition in section 5.6 in order to accomodate the catcher keyword, which differs from the other wrappers.

Wrapper       : SimpleWrapper | CatchWrapper.
SimpleWrapper : Before | After | Thrower 
                *common* <hosts> HostSpec 
                JavaCode
                [ ThrowsList ].
CatchWrapper  = Catcher <hosts> HostSpec "catch" CatchBody.

Thrower       = "thrower".
Catcher       = "catcher".

The semantics of exception contexts are the following. The thrower method is run before traversals into part objects of the host, after the before methods of context objects preceding the exception context in the traversal's parameter list have been run, and before the before methods of context objects following it in that parameter list. If the thrower does not throw an exception, it is for all intents and purposes a regular before method. The catcher method is run if any traversal to part objects from an instance of the class named by catcher ("Company" in the above example) throws an exception of the type named ("TotalException") or of a subclass of that type.

Let's analyze the advantages of the division of core behavior and exception behavior. While the exception context is tied to the SummingVisitor (a tie that could be loosened somewhat with proper use of subclassing), the traversal and the SummingVisitor are both seemingly independent of the ExceptionContext.

If exception contexts can be subclassed, then many different exception behaviors can be passed to the traversal. The traversal specification can be reused for each case, and the SummingVisitor can be used in many different contexts. Subclassing exception contexts may not be straightforward; the issues surrounding such subclassing are discussed in section 6.3.

Even if subclassing is not an alternative, there are other techniques that can be used to achieve the same result, to a point. For some traversals, one can construct exception contexts that are the union of other contexts (for different traversals) and use the union context for several traversals. In other cases, one can use delegation: there is one exception context that is used to determine at which classes throwing and catching should be done, but all the logic could be confined to methods of a delegate object installed in the context object. This is not nearly as nice as subclassing, but it works.

6.3. Subclassing Exception Contexts

How would we allow subclassing of ExceptionContext? Bear in mind that due to the structure of the Demeter system, it is not in general possible for Demeter to discover at compile time what kind of ExceptionContext subclass that is being passed to the traversal at run time (this information is hidden in source code that Demeter does not analyze). However, these problems are similar to the ones faced in implementing the before and after methods. The thrower clause is essentially a before method and the catcher clause is a wrapper around a subtraversal in some particular class (an around method in terms of section 5.6). Therefore, from an implementation perspective, subclassing exception contexts is no worse than subclassing regular context objects (with an around method).

In the current version of Demeter/Java, there are restrictions on how visitors can be subclassed. For example, the following (pseudo-code) is legal but does not do what you expect:

    Visitor {
    }

    Visitor1 extends Visitor {
      before Class1 (@ ... @)
    }

    Visitor2 extends Visitor {
      before Class2 (@ ... @)
    }

If a traversal is declared as taking a Visitor object, both Visitor1 and Visitor2 can be passed to the traversal, but the before methods will not be called. Instead, Visitor must be defined in the following manner:

    Visitor {
      before Class1 (@ @) // empty
      before Class2 (@ @) // empty
    }

(It is my understanding that this restriction will be removed in the future.) In the case of visitors, the workaround probably works fine. For exception contexts, however, we must be more careful. For reasons outlined above, Demeter cannot know (at compile-time) which exact subtype of an exception context is used with a particular traversal. Therefore, exception throwers and catchers should also behave like virtual methods. Consider:

    EC {
      catcher Class1 
        catch (Exception1 e) (@ @) // empty
    }

    EC1 extends EC {
      catcher Class1 
        catch (Exception1 e) (@ ... @)
    }

This form of subclassing (where the handlers are virtual) should be sufficient to implement exception context hierarchies where the various pieces have very different behavior, in the same sense that the visitors in the previous example applied to different classes. Consider:

    EC {
      catcher Class1a 
        catch (Exception1 e) (@ throw e; @) 
      thrower Class1b (@ @) throws Exception1;
      catcher Class2a 
        catch (Exception2 e) (@ throw e; @)
      thrower Class2b (@ @)  throws Exception2;
    }

    EC1 extends EC {
      catcher Class1a
        catch (Exception1 e) (@ bomb(); @)
      thrower Class1b 
        (@ throw new Exception1; @)
        throws Exception1;
    }

    EC2 extends EC {
      catcher Class2a
        catch (Exception2 e) (@ bomb(); @)
      thrower Class2b 
        (@ throw new Exception2; @)
        throws Exception2;
    }

Notice the use of the statement throw e in the body of the catchers in the class EC. This is necessary for orderly propagation of the exception in the absence of an actual handler for the given exception; for example, if EC2 is used and Exception1 is thrown by the Visitor, then that exception must not be dropped on the floor by the empty handler in the base class.

(Exception1 can indeed be thrown by the Visitor (or one of its subclasses!) without declaring this fact in the exception context -- the use of exception contexts does not preclude the use of traditional exception programming behavior.)

6.4. Discussion

There are some problems with the design presented in Section 6, and this subsection touches upon these.

It is possible that the thrower might better be implemented simply as a before method, which would also allow the programmer to use an after method; I don't think there would be any problem with this but I have yet to investigate it carefully.
If we don't throw away the thrower altogether, we should proably allow throwers to be specified as "before" or "after" methods, for comprehensiveness. Whether this is done with a double-keyword setup or with mega-keywords of the form thrower_before is less important.

7. New and Exciting Uses of Exceptions

This section discusses one novel aspects of exception handling that is expressible in an AP system: putting exception handling on the edges of a traversal. This section is speculative -- at this point in time, I do not wish to commit myself to this design.

A suggestion was made (by Mitch Wand) that in addition to adding information to an exception object in some procedures as an exception propagates back through the call graph, one might also want to add information to the object as the exception propagates back through some edges. This would require adding exception handlers to the edges of the call graph, which as far as I know is a not previously implemented idea. It's not hard to simulate edge handlers with handlers in the vertices (procedures), however, but I suspect they're not usually thought of as edge handlers.

Using the design from section 5, we could envision something like the following:

    Team {
	traversal collectBullpen(BullpenCollector b) {
	    bypassing -> *,disabled,* to Pitcher;
	} 
        protect -> *,active,* (ListTooLongException e)
            (@ throw new TeamTooLarge( "collectBullpen" + 
                                       b.collectorName() ); 
             @)
        throws TeamTooLarge;
    }

The meaning here is that if a ListTooLongException propagates back through an "active" edge, then it should be caught and the new exception should be thrown. As usual, actual class names can be used in place of the "*" as the source and sink of the edge.

To make this truly workable, it would be necessary to come up with a scheme for allowing the handler access to the source and sink objects, in the cases where their types are determinable. I leave this as future work.

8. Implementation Issues

8.1. Method and traversal annotations

The basic extension to Demeter/Java involves throws annotations on the wrappers and the protect clause on traversals. The purpose of these annotations is to make it possible to annotate each method declaration in the Java program with the exceptions that might be thrown by the method, as well as to set up exception handlers at certain points in the traversal. In this section, I outline efficient algorithms and implementation techniques that perform these tasks. (I consider the full algorithm definitions and the proofs of their correctness outside the scope of this term paper.)

As far as exception handlers on classes are concerned, it is easy to see that the protect clause tells the Demeter/Java system exactly which classes handlers need to be set up in. Each class will have a traversal method for every traversal. It is therefore possible to set up the exception handler by emitting the traversal method with the otherwise-generated method body wrapped in a try-catch block.

As for annotations on the methods, they can be computed using standard flow-graph operations, by considering the object graph to be a flow graph and the classes to be procedures. Sets of exceptions are analogous to variable definitions (assignments), and exception handlers are the last "uses" of the exception. The exception ranges are then analogous to live ranges; an exception is live at a node if it is live at any callees of the node and not handled in the node. An edge wrapper can be handled by splitting the edge in two and adding a node to join the two parts.

8.2. Exception contexts

8.2.1 Analysis and annotation

I have already demonstrated an informal mapping between the basic design and the exception contexts. Here is a slightly more formal mapping: consider a traversal T and an exception context E used by T, with E.t the set of throwers and E.c the set of catchers. Produce a new traversal T' where there are additional protect clauses, one for each element of E.c. Produce a new context E' where the E.t have been converted to before methods and the E.c are removed altogether. Now, whereever T and E are used, use T' and E' instead.

8.2.2. Implementation

Since exception contexts can be subclassed as outlined earlier, we cannot simply textually substitute the catch clause from the context into a protect clause in the traversal; that trick was only valid for analysis because the method body did not matter. However, although the actual behavior when the exception has been caught might vary, the programmer has no freedom in specifying whether the exception will be caught: that is determined by the basic exception context independent of any derived contexts. Therefore, a variation on the trick from the previous section can be used. Again, create a T' and an E', but let the protect clauses on T' set up the exception handler and then call a secret method in E' that implements the handler code. The thrower methods of E become before methods in E', and the catcher methods of E become secret exception handling methods in E'.

8.3. Handlers on call graph edges

If a traversal uses a context that has an edge handler (something I have not specified, by the way), or the protect clause of the traversal has an edge handler, then that edge handler can be implemented in the following manner.

Given classes C1 and C2 and edge E from C1 to C2 and an exception handler H on that E, the traversal will call from some method C1.n to some method C2.m to perform the traversal rooted at C2. However, every other edge from some Ci to C2 will also go along edge E to C2.m, but we don't want to put handlers on those edges, so we can't put the handler in C2.m.

The handler H can be put in C1.n, around the call to C2.m; this is probably the simplest. The handler will want to have access to both objects, in some circumstances at least, and both are available from C1. (In fact, one can argue that H has to go in C1.n: if the exception is a NullPointerException that is raised because C2 is not there, then the programmer will still want to catch it on that edge.)

9. Patterns of Exception Use

I have looked at some Java and Modula-3 programs to find out how exceptions are typically used. These are some preliminary observations; it is a broad topic and a full investigation is a massive undertaking, not the least because it involves reading the source code for fairly substantial programs. Still, I'm fairly confident that the patterns that are listed below dominate.

9.1. Patterns

Aggregation and abstraction of exceptions: low-level exceptions are caught early on and re-thrown as higher-level, application-specific exceptions; this was illustrated earlier.
Embedded inside large procedures: most code I have looked at is not at all visitor-like (small, simple bodies), being instead functionally complex. Exception handling code is often tucked into the middle of the procedures' bodies, around calls to low-level procedures. In a way, this handling code is very similar to exception handlers on edges.

Nested try blocks: a pattern that I see now and again is the following, where nested try blocks are embedded in a large procedure:

  ...
  try {
    ...
    try {
     ...
    }
    catch (SomeException e) {
     ...
    }
    ...
  }
  catch (SomeOtherException e) {
   ...
  }

Exceptions are external: most exceptions that are not high-level "abstract" exceptions originate from external causes; programs rarely discover such errors in the data that an exception is necessary. Input data format errors are sometimes handled with exceptions.

9.2. Specimens

So far I have looked at the following systems:

"Jigsaw" is an HTTP server from the w3 consortium, written entirely in Java. It consists of about 60,000 lines of source code.
The DEC SRC Modula-3 libraries, run-time system, and bundled applications consist of about 13MB of source code.

10. Miscellany

In this section, I discuss two observations that are somewhat peripheral to the paper as a whole but that I have been asked to write down. The first observation is that there seems to be a connection between programming style and the programming tools available. This is hardly surprising (for example, compare programming styles in functional, logic, and imperative languages) but there are interesting effects of this when using OO design patterns with "standard" OO languages. The second observation is that one can create fast traversals at run-time by using a run-time representation of the class graph in a clever way.

10.1. An observation about programming style

When I was studying large Modula-3 programs in my search for patterns of exception usage, I made the following observation. Most of the programs I looked at (from the DEC SRC Modula-3 distribution, some 13MB of source code), although using the object-oriented features of Modula-3, were not written in what I would call a heavily object-oriented style, e.g., using visitor patterns and similar structure-shy programming styles based on pervasive use of virtual functions. Rather, data structures used virtual functions in their public interfaces but the module implementations contained (many) private stand-alone functions.

Here are some speculations about the reasons for this programming style. First, some of the programs are relatively old, dating from the early '90s; they may predate many innovations in style rules for OO programming. Second, a number of the programmers are old Modula-2+ hands; my impression of Modula-2+ is that it is not a particularly object-oriented language. Third, the class hierarchies were shallow and the code was complex, so the programs were emphasizing computation over data representation and hence perhaps somewhat non-OO. Fourth, and in my opinion most important, however, is that anyone who has tried to write a visitor-pattern based OO program by hand for a nontrivial class hierarchy will know that it is simply not a natural way of writing a program.

Complete use of visitors makes it a real pain both to write the program and to understand the resulting flow of control -- like unrestricted use of higher-order functions, it is simply not a good way of writing programs that are to be read by humans. The visitor pattern introduces levels of indirection in the program and these impede construction and understanding. The use of tools like Demeter certainly makes it easier to write visitor-based programs, and in my very limited experience it also makes it possible to read and understand those programs. Such tools are therefore in my estimation not only aids in construcing some kinds of OO programs, but fundamentally necessary.

10.2. Run-time visitors

Consider the problem of constructing a customized traversal at run-time. The traversal specification will be created as a string and passed to some sort of traversal compiler and come back as an object that represents the traversal. This object can then be paired with an instance of some class in the program's class hierarchy and the traversal can be performed. (The mechanisms for the actual traversal are not given.)

Assume that it is important that the traversal is reasonably fast. Then the problem becomes to create at run-time a program that performs the traversal efficiently. There are several ways of doing this:

If Java had "eval" this might be straightforward, but Java does not have "eval". (I believe, however, that it is possible to simulate "eval" well enough for our purposed here but that's an issue I'll leave alone in this note. Suffice it to say, that since traversals are pretty restricted programs and Java bytecode is portable, we could create a program that compiles a traversal by writing bytecodes that perform the traversal to a .class file, that could then be loaded into the running Java system. Let your imagination take over.)
We can have an interpreter that takes as its input an object graph and a traversal object containing traversal instructions, and which interprets these instructions and performs the traversal. The only necessary assumption seems to be that we require that the Demeter system create a generalized interpreter procedure that can cope with all object types. For example, the interpreter could be encoded as methods in the program's objects, being passed an argument that encodes which edges to traverse.
Similarly, can have a general traversal procedure that is somehow specialized for each traversal and that takes as its input an object graph and traverses it, without interpretation.

The method I will be describing in this note has the flavor of an interpreter interpreting traversal instructions, but at the same time behaves very much like a specialized traversal procedure.

Consider having an interpreter for all the program's object graphs; such a procedure or set of procedures can be constructed by the Demeter/Java system, as I have already argued, and it can be directed by instructions as to which edges to traverse and which edges to leave alone. The trick is to find the right instruction set, the right encoding, and a fast interpretation mechanism.

The instruction set consisting of the two instructions YES and NO is pretty small, and it's the one I'll be using.

The encoding is also simple: any object reference means YES, and a null object reference means NO.

The trick is in the interpreter. The interpreter is not the object graph itself, as suggested above, but a meta-graph or prototypical graph: a graph that models the class hierarchy specified in the Class Dictionary. Consider this grammar:

  A = <b> B <c> C.
  B = <d> D <e> E.
  C = <f> F.
  D = <b> B <g> G.
  E = .
  F = <h> H.
  G = .

The grammar can be modeled by an object graph that has objects that correspond to the classes, with part-objects that correspond to the edges. There will be seven objects in the graph, one for each of A..G. The graph may be circular, and for the above grammar it will.

A traversal is a specification of a subgraph of this graph. For example, the traversal from A to G goes through B and D, and includes a back-edge from D to B. More importantly, the <c> edge in the A object is nil, as is the edge <e>. There are A, B, D, and G objects in the subgraph.

Now let the subgraph be the interpreter, and let each object X in the subgraph have a method rttrav that takes as its input an object of the class for which X is the proto-object; e.g., protoA.rttrav() takes as input an object of class A. The body of protoA.rttrav() will then look like this:

  void rttrav( A a, Visitor v ) {
    if (b != null) b.rttrav( a.get_b(), v );
    if (c != null) c.rttrav( a.get_c(), v );
  }

The interpreter overhead is thus a load and compare-with-null of an instance variable, per edge in the actual object graph. If the instruction is YES, that is, non-null, then the edge is there, and if the instruction is NO, that is, null, then the edge is not there.

Now, how do we implement the before and after methods? One possibility is to make the Demeter/Java system add two methods to each visitor object, called rtbefore and rtafter, which take a magic integer and an object and switch on the integer, invoking the right method (if any):

  void rttrav( A a, Visitor v ) {
    if (a.before) v.rtbefore( NAME_A, this );
    if (b != NULL) b.rttrav( a.get_b() );
    if (c != NULL) c.rttrav( a.get_c() );
    if (a.after) v.rtafter( "A", this );
  }

The rtbefore method might look like this:

  void rtbefore( int name, Object ) {
    switch (name) {
     NAME_A : this.before_A( Object ); break;
     NAME_B : this.before_B( Object ); break;
    }
  }

When the traversal is constructed, the before and after methods in the proto-objects are set to true or false as appropriate. (Arbitrary visitors cannot be constructed at run-time.)

Some issues have been tossed around but are still resolved.

The specification so far does not deal with some subtle traversal rules, as in the "A to D via B" problem on the graph ABCD. This may be solvable by unrolling the proto-graph once and duplicating some objects, but I'm worried about an explosion in the number of states. A different strategy uses the moral equivalent of self-modifying code: the edge C-D would initially be null; the traversal code in B would create it. However, this involves adding at least one more load-and-compare to each traversal method body.
How do we deal with cycles in the object graph, if at all? The current system does not, and I don't see any reason to.
How do we deal with abstract classes? I don't really see any problems here; the abstract classes can be abstract in the proto-graph, or can be folded into the proto-objects for the concrete classes.
It's unclear how "around" methods would work, if at all, and exception handling does not work well with run-time traversals.

Acknowledgements

Doug Orleans answered a number of questions about both principles and implementation, and the ideas from both Karl Lieberherr and Mitch Wand (which came to me via mail messages forwarded to me) were useful starting points for some of the designs.

References

[1] James Gosling, Bill Joy, Guy Steele, The Java Language Specification. Addison-Wesley, 1996.
[2] Margaret A. Ellis, Bjarne Stroustrup, The Annotated C++ Reference Manual. Addison-Wesley, 1990.
[3] Greg Nelson, ed., Systems Programming with Modula-3. Prentice Hall, 1991.
[4] Ellis Horowitz, Fundamentals of Programming Languages. Computer Science Press, 1983.