Optimal Navigation of Objects using Regular Expressions for both Queries and Metadata

Speaker: Karl Lieberherr, Northeastern University, CCIS, Boston

Abstract: Navigation of objects (data) based on structural queries and metadata is a fundamental operation that is a precursor for other information processing tasks performed by programs. Metadata about the structure of data is essential for retrieving information efficiently. It is important to study object navigation in a general setting that turns out to significantly simplify previous approaches in the spirit of Polya's Inventor's Paradox.

We model both structural queries and metadata as regular expressions over an alphabet that corresponds to the data types/classes. The navigation is characterized by a regular expression which has the special property that if the input regular expressions for queries and meta data correspond to DFAs SG (for queries) and CG (for metadata) then the corresponding navigation automaton will be a DFA which has at most SG.states * CG.states states. Several closure properties of regular languages are used to construct the automaton.

The work improves the earlier work of the Demeter group [TOPLAS 1995 and TOPLAS 2004] both conceptually (the query language and its interpretation is practically more useful and more general) and algorithmically (simpler and faster algorithms). The work applies to better implementations of schema-aware XPath and in general to locating subtrees that satisfy some structural pattern.

Joint work with Ahmed Abdelmeged, Therapon Skotiniotis, Mitchell Wand and Pete Manolios.