The technical report:

Traversal Semantics in Object Graphs
is a good paper to learn about the precise definition
and semantics of traversal strategies.

Read the above paper before you read the TOPLAS paper below which
contains detailed, efficient algorithms.

Bibtex entry:

@article{973102,
author = {Karl Lieberherr and Boaz Patt-Shamir and Doug Orleans},
title = {Traversals of object structures: Specification and Efficient Implementation},
journal = {ACM Trans. Program. Lang. Syst.},
volume = {26},
number = {2},
year = {2004},
issn = {0164-0925},
pages = {370--412},
doi = {http://doi.acm.org/10.1145/973097.973102},
publisher = {ACM Press},
}

@TECHREPORT{strategies:LP,
AUTHOR       = "Karl J. Lieberherr and Boaz Patt-Shamir",
TITLE        = "{Traversals of Object Structures: Specification and Efficient
Implementation}",
INSTITUTION  = "College of Computer Science, Northeastern University",
YEAR         = 1997,
MONTH        = "July",
NUMBER       = "{NU-CCS-97-15}",
}


This paper significantly advances the concept of succinct traversal specifications and provides an efficient implementation. It both generalizes our earlier traversal models and improves on the corresponding algorithms.

The publically available Java implementation is at: AP-Library.

Older version: ftp://ftp.ccs.neu.edu/pub/people/lieber/strategies.ps

Errata (already incorporated in above version):
Gary Gengo pointed out an omission of an edge in Fig. 3 which
also has implications on Fig. 4.

In Figure 3, subfigure 5 there is a subclass edge missing from
Z in copy 1 to D in copy 2.
The same holds in subfigures 6 and 7.

The missing subclass edge also needs to be added to all subfigures
1-9 in Fig. 4.
In subfigure 3, node D in component 2 needs to also be shaded.
In subfigure 4, node B in component 2 needs to also be shaded.
In subfigure 5, node E in component 2 needs to also be shaded.
In subfigure 4, node B in component 2 needs to be shaded.

======

The earlier papers with Jens Palsberg used the terminiology of
PathSet[G](S): The set of paths in a graph G following traversal
specification S.
The PathSet terminology should be used in this paper rather
than introducing a new one.

Let {\em S}=(S, s, t) be a strategy, let G be a class graph, let N be a
name map for S and G.

PathSet[G](S) =
{ p' | p' in P_G_(N(s),N(t)) and exists p in P_S_(s,t)
such that p' is an expansion of N(p) }
where P_g_(x,y) is the set of paths in graph g
from x to y


Comparison with automata-theoretic algorithms:

The paper introduces two algorithms: Algorithm 1, better called Traversal Graph Algorithm and Algorithm 2, better called Traversal Method Algorithm. An intuitive understanding of those algorithms from an automata-theoretic point of view can be obtained by simplifying the problem and considering only class graphs without subgraph edges. The Traversal Graph Algorithm now becomes the cross-product algorithm for NFAs. See NFA algorithms and AP algorithms for examples. It is necessary to encode both the class graph and the strategy graph in a certain way to see the connection to NFA intersection.

It is important to note that the Traversal Graph Algorithm is a non-trivial application of the NFA intersection algorithm. The Traversal Graph Algorithm deals 1. with general class graphs, 2. with negative information in the strategy graph edges 3. with a controlled minimization of the automaton without resorting to a general purpose automaton minimizer that would destroy the desired structure.

The Traversal Method Algorithm also has a connection to NFA intersection at least from a theoretical point of view: The intersection of the traversal graph and an object graph tells us whether there is any traversal at all. The Traversal Method Algorithm can be viewed as an application of an NFA simulator. However, again there are significant differences. The Traversal Method Algorithm deals with 1. general class graphs, 2. the objects traversed are trees, not strings

Some further motivation which was not included in the paper:

Object or data traversing is an ubiquitous task in applications
which use composite objects or data. Designers and programmers
have informal intents about traversing when they design their
applications. They need to translate their intents into
programs and this can be a surprisingly delicate task.
To facilitate the process of implementing traversals we introduce
a new model to express traversal intent in the form of traversal
strategy graphs. We show how to correctly translate those strategies
into code when the details of the object or data structure are given
in the form of a (class or data structure) graph.

When building a reusable set of classes we must ensure that the
dependencies between the objects and operations are easily customizable
to allow the collective behavior of the objects to be easily
modified.

A useful rule to follow when building \oo applications is the
delegation
rule \cite{LoD-refs}:

(The delegation rule can be viewed as a version of the Law of Demeter

Public methods should only receive support from their
own internal methods or public methods of any class.
Without such a rule, when the associated objects change, the public
operations would have to be modified directly.
By following the rule, the impact of changing the associations
will not directly affect the primary operations of an object.

In this paper we follow the above rule by implementing the public
methods in the following indirect way conforming to the delegation rule:

R f(A a) {
instantiate and initialize visitor objects v1, ... ,vn;
instantiate a traversal strategy object t with respect to
a class diagram G;
this.do_the_work(t,v1, ... ,vn);
return the appropriate value from the results in the visitors;
}

This programming style, which we call the traversal-visitor style,
can be viewed as an application of the visitor design pattern.
The novelty in this paper is the model and implementation
of traversal objects. Traversal objects are specified through
traversal strategies (strategy for short)
which specify the architecture of the traversal.
Notice how object implementations become doubly decoupled from the
structure of the associated objects: The traversal strategy object
specifies loosely how to navigate through the objects
to find the required associated objects {\em and}
the details of the object structure is specified by a class diagram.
the visitors say what needs to be done when certain objects
are visited. Programs written in this traversal-visitor
style are called structure-shy since they can work with many
different class diagrams.

It is important to mention that the traversal-visitor style
can be applied  in a conventional object-oriented environment
without any pre-processors, post-processors, or language
extensions. As a result, the traversal-visitor style of programming
can be used without having to commit to non-standard or
poorly supported tools. Hewlett-Packard has used this approach
successfully for implementing the installation software
for their family of printers.

An example of a strategy is:

from Window to Drawable

which, when combined with a class diagram, will find all drawable objects
contained in a window. If we want only some of the drawable objects,
we use a strategy like:

from Window .
( through Car . to Drawable +
bypassing Model to Drawable)

This strategy will find the union of the drawable objects contained
in cars and the drawable objects not contained in models.
As those examples demonstrate, traversal strategies provide for a
very nice high level model for expressing traversals.

%Summary of applications:

Traversal strategies have many applications beyond the ones
indicated above. In addition to allowing us to significantly
decouple object implementations from object associations,
strategies are useful for specifying object pickling
[SunSoft,crista], object caching [Usenix], object persistence etc.

In summary, this paper makes contributions to three areas:

algorithmic improvement:
The algorithm presented for compiling strategies
runs in polynomial time and also solves
the shortcut, zigzag and
subclass invariance problems. The previous algorithm was exponential.

more expressiveness:
the model supports how did you get there'' information.
We also give a formal treatment for
negative strategies: so far, we treated bypassing and stopping
only informally.

better syntax:
Redundancy has been removed:
[A,B].[B,C](
[C,D].[D,E] +
[C,F].[F,G])

can be written in simpler form:

from A via B via C
(via D to E,
via F to G)

Notice that class C is now mentioned only once; in the
first expression it is mentioned three times.

An example of a positive strategy is:

from {Receipt}
through {Personal}
(
through {Total}
to {Cost : totalCost},
through {LineItem}
to {Cost : lineItemCost}
)

PercentageVisitor {
after LineItem { System.out.println( lineItemCost/totalCost );}
...
}

Prints for each line item in a personal receipt
the fraction of the total cost.
We assume that Total is traversed before LineItem.

Related work is in:

@ARTICLE{mendelzon:siam,
AUTHOR = "Alberto Mendelzon and Peter Wood",
TITLE = "Finding regular simple paths in graph databases",
JOURNAL = "SIAM Journal on Computing",
YEAR = 1995,
PAGES = "1235-1258",
MONTH = "",
VOLUME = 24,
NUMBER = 6
}


Feedback we received on the paper:

Feedback by Yannis Smaragdakis: Postscript or PDF

Some of the feedback from the Journal of the ACM (Robert Harper, CMU, editor) including the authors' response: PDF

Viewgraphs explaining the algorithms of the paper: PowerPoint or PDF

Viewgraphs about theory of traversals: PowerPoint.

Response to Mitch Wand and JACM Reviewers. AP is a sound methodology: Comparing AP and OO.

Feedback from the IEEE Transactions on Software Engineering (Don Batory, editor) including the authors' response: IEEE TSE

Feedback from the ACM Transactions on Programming Languages and Systems (Ron Cytron, editor) including the authors' response: TOPLAS