vldb2005
Abstract
XPath is the defacto navigation language for XML documents conforming to DTD
and it is important to
have efficient evaluation and checking techniques available for XPath.
Traversal strategies, a component of Adaptive Programming,
are a well studied navigation language for objects
conforming to class graphs and
an efficient evaluation technique has been developed in the previous
millenium. Traversal strategies
don't deal with the full generality of XPath but instead they focus on
ancestor-descendent (ad) edges. Indeed, a traversal strategy is a
graph consisting only of ad edges with optional negative constraints on the edges.
While it has been widely acknowledged in the data base community
that DTD may speed up XPath evaluation, no paper has shown an exponential improvement
by using the DTDs. We show an infinite sequence of XPath expression/DTD/document
triples so that
evaluation using the meta information in the DTDs is exponentially faster.
We draw a strong connection between traversal strategy evaluation and
XPath evaluation and show how this leads to
comprehensive logical optimization techniques for XPath.
Our evaluation is optimal in the sense that by visiting one node less than
our approach visits would lead to wrong results.
Finally, we present empirical results of our speed up and explore various
classes of XML documents that can benefit from this type of evaluation.
We believe that the automata-theoretic
algorithmic ideas presented in this paper should be
an important building block of any efficient XPath evaluator.
The paper:
[pdf]
[ps.gz]
[src]
Last built: Sat May 13 13:06:49 EDT 2006 by jpalm on denali