The technical report: Traversal Semantics in Object Graphs is a good paper to learn about the precise definition and semantics of traversal strategies. Read the above paper before you read the TOPLAS paper below which contains detailed, efficient algorithms. Bibtex entry: @article{973102, author = {Karl Lieberherr and Boaz Patt-Shamir and Doug Orleans}, title = {Traversals of object structures: Specification and Efficient Implementation}, journal = {ACM Trans. Program. Lang. Syst.}, volume = {26}, number = {2}, year = {2004}, issn = {0164-0925}, pages = {370--412}, doi = {http://doi.acm.org/10.1145/973097.973102}, publisher = {ACM Press}, } @TECHREPORT{strategies:LP, AUTHOR = "Karl J. Lieberherr and Boaz Patt-Shamir", TITLE = "{Traversals of Object Structures: Specification and Efficient Implementation}", INSTITUTION = "College of Computer Science, Northeastern University", YEAR = 1997, MONTH = "July", NUMBER = "{NU-CCS-97-15}", ADDRESS = "Boston, MA" }

This paper significantly advances the concept of succinct traversal specifications and provides an efficient implementation. It both generalizes our earlier traversal models and improves on the corresponding algorithms.

The publically available Java implementation is at: AP-Library.

Older version: ftp://ftp.ccs.neu.edu/pub/people/lieber/strategies.ps

Errata (already incorporated in above version): Gary Gengo pointed out an omission of an edge in Fig. 3 which also has implications on Fig. 4. In Figure 3, subfigure 5 there is a subclass edge missing from Z in copy 1 to D in copy 2. The same holds in subfigures 6 and 7. The missing subclass edge also needs to be added to all subfigures 1-9 in Fig. 4. In subfigure 3, node D in component 2 needs to also be shaded. In subfigure 4, node B in component 2 needs to also be shaded. In subfigure 5, node E in component 2 needs to also be shaded. In subfigure 4, node B in component 2 needs to be shaded. ====== The earlier papers with Jens Palsberg used the terminiology of PathSet[G](S): The set of paths in a graph G following traversal specification S. The PathSet terminology should be used in this paper rather than introducing a new one. Let {\em S}=(S, s, t) be a strategy, let G be a class graph, let N be a name map for S and G. PathSet[G](S) = { p' | p' in P_G_(N(s),N(t)) and exists p in P_S_(s,t) such that p' is an expansion of N(p) } where P_g_(x,y) is the set of paths in graph g from x to y

Comparison with automata-theoretic algorithms:

The paper introduces two algorithms: Algorithm 1, better called Traversal Graph Algorithm and Algorithm 2, better called Traversal Method Algorithm. An intuitive understanding of those algorithms from an automata-theoretic point of view can be obtained by simplifying the problem and considering only class graphs without subgraph edges. The Traversal Graph Algorithm now becomes the cross-product algorithm for NFAs. See NFA algorithms and AP algorithms for examples. It is necessary to encode both the class graph and the strategy graph in a certain way to see the connection to NFA intersection.

It is important to note that the Traversal Graph Algorithm is a non-trivial application of the NFA intersection algorithm. The Traversal Graph Algorithm deals 1. with general class graphs, 2. with negative information in the strategy graph edges 3. with a controlled minimization of the automaton without resorting to a general purpose automaton minimizer that would destroy the desired structure.

The Traversal Method Algorithm also has a connection to NFA intersection at least from a theoretical point of view: The intersection of the traversal graph and an object graph tells us whether there is any traversal at all. The Traversal Method Algorithm can be viewed as an application of an NFA simulator. However, again there are significant differences. The Traversal Method Algorithm deals with 1. general class graphs, 2. the objects traversed are trees, not strings

Some further motivation which was not included in the paper:

Object or data traversing is an ubiquitous task in applications which use composite objects or data. Designers and programmers have informal intents about traversing when they design their applications. They need to translate their intents into programs and this can be a surprisingly delicate task. To facilitate the process of implementing traversals we introduce a new model to express traversal intent in the form of traversal strategy graphs. We show how to correctly translate those strategies into code when the details of the object or data structure are given in the form of a (class or data structure) graph. When building a reusable set of classes we must ensure that the dependencies between the objects and operations are easily customizable to allow the collective behavior of the objects to be easily modified. A useful rule to follow when building \oo applications is the delegation rule \cite{LoD-refs}: Public methods should not have direct access to associated objects. (The delegation rule can be viewed as a version of the Law of Demeter which allows access to the immediately associated objects.) Public methods should only receive support from their own internal methods or public methods of any class. Without such a rule, when the associated objects change, the public operations would have to be modified directly. By following the rule, the impact of changing the associations will not directly affect the primary operations of an object. In this paper we follow the above rule by implementing the public methods in the following indirect way conforming to the delegation rule: R f(A a) { instantiate and initialize visitor objects v1, ... ,vn; instantiate a traversal strategy object t with respect to a class diagram G; this.do_the_work(t,v1, ... ,vn); return the appropriate value from the results in the visitors; } This programming style, which we call the traversal-visitor style, can be viewed as an application of the visitor design pattern. The novelty in this paper is the model and implementation of traversal objects. Traversal objects are specified through traversal strategies (strategy for short) which specify the architecture of the traversal. Notice how object implementations become doubly decoupled from the structure of the associated objects: The traversal strategy object specifies loosely how to navigate through the objects to find the required associated objects {\em and} the details of the object structure is specified by a class diagram. In addition, the visitors say what needs to be done when certain objects are visited. Programs written in this traversal-visitor style are called structure-shy since they can work with many different class diagrams. It is important to mention that the traversal-visitor style can be applied in a conventional object-oriented environment without any pre-processors, post-processors, or language extensions. As a result, the traversal-visitor style of programming can be used without having to commit to non-standard or poorly supported tools. Hewlett-Packard has used this approach successfully for implementing the installation software for their family of printers. An example of a strategy is: from Window to Drawable which, when combined with a class diagram, will find all drawable objects contained in a window. If we want only some of the drawable objects, we use a strategy like: from Window . ( through Car . to Drawable + bypassing Model to Drawable) This strategy will find the union of the drawable objects contained in cars and the drawable objects not contained in models. As those examples demonstrate, traversal strategies provide for a very nice high level model for expressing traversals. %Summary of applications: Traversal strategies have many applications beyond the ones indicated above. In addition to allowing us to significantly decouple object implementations from object associations, strategies are useful for specifying object pickling [SunSoft,crista], object caching [Usenix], object persistence etc. In summary, this paper makes contributions to three areas: algorithmic improvement: The algorithm presented for compiling strategies runs in polynomial time and also solves the shortcut, zigzag and subclass invariance problems. The previous algorithm was exponential. more expressiveness: the model supports ``how did you get there'' information. We also give a formal treatment for negative strategies: so far, we treated bypassing and stopping only informally. better syntax: Redundancy has been removed: [A,B].[B,C]( [C,D].[D,E] + [C,F].[F,G]) can be written in simpler form: from A via B via C (via D to E, via F to G) Notice that class C is now mentioned only once; in the first expression it is mentioned three times. We start with an informal discussion of positive strategies. An example of a positive strategy is: from {Receipt} through {Personal} ( through {Total} to {Cost : totalCost}, through {LineItem} to {Cost : lineItemCost} ) PercentageVisitor { after LineItem { System.out.println( lineItemCost/totalCost );} ... } Prints for each line item in a personal receipt the fraction of the total cost. We assume that Total is traversed before LineItem. Related work is in: @ARTICLE{mendelzon:siam, AUTHOR = "Alberto Mendelzon and Peter Wood", TITLE = "Finding regular simple paths in graph databases", JOURNAL = "SIAM Journal on Computing", YEAR = 1995, PAGES = "1235-1258", MONTH = "", VOLUME = 24, NUMBER = 6 }Feedback we received on the paper:

Feedback by Yannis Smaragdakis: Postscript or PDF

Some of the feedback from the Journal of the ACM (Robert Harper, CMU, editor) including the authors' response: PDF

Viewgraphs explaining the algorithms of the paper: PowerPoint or PDF

Viewgraphs about theory of traversals: PowerPoint.

Response to Mitch Wand and JACM Reviewers. AP is a sound methodology: Comparing AP and OO.

Feedback from the IEEE Transactions on Software Engineering (Don Batory, editor) including the authors' response: IEEE TSE

Feedback from the ACM Transactions on Programming Languages and Systems (Ron Cytron, editor) including the authors' response: TOPLAS