Demeter ideas are used in the XML domains described below. From one point of view, Demeter is an early generalization of XML, developed years before XML was born. Demeter is not limited to mark-up languages: it targets the more general LL(1) languages, which beat mark-up languages in terms of robustness, yet are almost as easy to parse and also support a bijection between trees and sentences. For recent (2005) related work on this topic, see: Data Description Languages. The lessons learned with Demeter are applicable to XML. Demeter focuses on data binding and on efficiently traversing objects when meta information is available as a schema. It is easy to show that traversing objects may become exponentially faster when the schema information is used in the right way.
Discussion of XML-integrated Languages and Demeter Interfaces.
For a quick example, see the Library XML Navigation Example.
For related slides, see Navigation Example.
XPath is the navigation language for XML. See W3C Reports for related documents.
XPath is used in several other XML-related languages (XQL, XPointer, XSLT, etc.).
An interesting question for an XPath expression is whether it is structure-shy, i.e., whether changes to the underlying XML schema require few or no changes to the XPath expression.
XPath does support structure-shy expressions: it offers a rich language for navigation specification that includes a subset of the Demeter strategies notation.
The most important construct in XPath is the location path.
The meaning of a location path is a node-set: an unordered collection of nodes without duplicates. Other XPath expressions may instead evaluate to a boolean, a number, or a string.
There are absolute and relative location paths. An absolute location path is a "/" optionally followed by a relative location path.
Of special interest are the abbreviated location paths because they give us structure-shyness:
AbbreviatedAbsoluteLocationPath ::= '//' RelativeLocationPath
AbbreviatedRelativeLocationPath ::= RelativeLocationPath '//' Step
LocationPath ::= AbbreviatedRelativeLocationPath | AbbreviatedAbsoluteLocationPath
// is short for /descendant-or-self::node()/. For example, //para is short for /descendant-or-self::node()/child::para and so will select any para
element in the document.
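The structure-shyness of abbreviated paths can be illustrated with a minimal sketch. It assumes Python's standard xml.etree.ElementTree module, which supports only a subset of XPath and evaluates paths relative to an element (hence the leading "."). The two sample documents and the helper name texts are hypothetical.

```python
import xml.etree.ElementTree as ET

# Two hypothetical versions of one document type:
# version 2 introduces a new <section> level between <chapter> and <para>.
DOC_V1 = "<doc><chapter><para>one</para><para>two</para></chapter></doc>"
DOC_V2 = ("<doc><chapter><section><para>one</para><para>two</para>"
          "</section></chapter></doc>")


def texts(xml_text, path):
    """Return the text of every element selected by `path`,
    evaluated relative to the document element."""
    root = ET.fromstring(xml_text)
    return [elem.text for elem in root.findall(path)]


EXPLICIT = "./chapter/para"  # spells out every step of the structure
SHY = ".//para"              # abbreviated: any para below the root

v1_explicit = texts(DOC_V1, EXPLICIT)
v2_explicit = texts(DOC_V2, EXPLICIT)  # misses the paragraphs after the change
v1_shy = texts(DOC_V1, SHY)
v2_shy = texts(DOC_V2, SHY)            # unaffected by the change

print(v1_explicit, v2_explicit, v1_shy, v2_shy)
```

The explicit path has to be rewritten when the schema gains a level, while the abbreviated path keeps working unchanged.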
A location path can be used as an expression. The expression returns the set of nodes selected by the path.
The | operator computes the union of its operands, which must be node-sets.
UnionExpr ::= PathExpr | UnionExpr '|' PathExpr
PathExpr ::= LocationPath | ...
/A//B//C corresponds to the three-node strategy graph A->B->C, also expressed as: from A through B to C. /A//B//C | /A//B2//C corresponds to the four-node strategy graph with edges A->B, B->C, A->B2, B2->C. Note the difference in semantics, however: the XPath expression returns a node set, whereas the strategy returns a path set that can be used (with a "collection" visitor) to compute the node set.
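The difference between a path set and a node set can be made concrete with a small sketch. It is not the Demeter implementation: the Node class, the function strategy_paths, and the sample tree are hypothetical, and only line strategies (a straight sequence of labels) are handled.

```python
from dataclasses import dataclass, field


@dataclass(eq=False)  # compare nodes by identity, not by content
class Node:
    label: str
    children: list = field(default_factory=list)


def strategy_paths(root, strategy):
    """Yield every path (tuple of nodes) that starts at `root`, visits the
    labels in `strategy` in order, and ends at a node carrying the last label."""
    *waypoints, target = strategy

    def walk(node, matched, prefix):
        path = prefix + (node,)
        if matched == len(waypoints):
            if node.label == target:
                yield path            # all waypoints seen, target reached
        elif node.label == waypoints[matched]:
            matched += 1              # next waypoint reached
        elif matched == 0:
            return                    # the path must start at the source
        for child in node.children:
            yield from walk(child, matched, path)

    yield from walk(root, 0, ())


# Hypothetical object tree: two C nodes lie below B, one C node does not.
tree = Node("A", [
    Node("B", [Node("C"), Node("X", [Node("C")])]),
    Node("C"),
])

found = list(strategy_paths(tree, ["A", "B", "C"]))
path_set = [tuple(n.label for n in p) for p in found]  # what the strategy yields
node_set = [p[-1] for p in found]                      # what /A//B//C would select
print(path_set)
```

Keeping only the last node of each path plays the role of the "collection" visitor: it reduces the path set to the node set, and the intermediate nodes are lost.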
See the series-parallel strategies journal paper and the strategies paper for definitions and compilation algorithms. The Demeter literature contains valuable information on how to write XPath code that is structure-shy.
What can XPath implementers learn from the Demeter experience? XPath expressions define possibly infinite sets of paths in the XML schema or DTD. The implementer needs a suitable data structure to represent those possibly infinite path sets so that they can be used to guide the traversal of an XML document. We believe that the best way to represent the path sets is to use a so-called traversal graph, as described in the strategies paper.
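How schema information can guide a traversal may be sketched as follows. This is a much-simplified stand-in for the traversal graph of the strategies paper, not that algorithm: it covers only the single-edge strategy "from Library to Paragraph", and the schema, the document, and the function names are hypothetical. Because Chapter may contain Chapter, the set of schema paths is infinite, yet a finite set of "useful" element types is enough to steer the traversal.

```python
import xml.etree.ElementTree as ET

# Hypothetical schema as a graph:
# element type -> element types that may appear as its children.
SCHEMA = {
    "Library": ["Book", "Staff"],
    "Book": ["Chapter"],
    "Chapter": ["Paragraph", "Chapter"],  # recursion: infinitely many paths
    "Paragraph": [],
    "Staff": ["Person"],
    "Person": ["Address"],
    "Address": [],
}


def can_reach(schema, target):
    """Return the element types from which `target` is reachable,
    including `target` itself."""
    reach = {target}
    changed = True
    while changed:
        changed = False
        for parent, kids in schema.items():
            if parent not in reach and any(k in reach for k in kids):
                reach.add(parent)
                changed = True
    return reach


def collect(elem, target, useful, visited):
    """Collect `target` elements at or below `elem`, skipping every subtree
    whose element type cannot lead to `target` according to the schema."""
    visited.append(elem.tag)
    found = [elem] if elem.tag == target else []
    for child in elem:
        if child.tag in useful:
            found.extend(collect(child, target, useful, visited))
    return found


DOC = ("<Library>"
       "<Book><Chapter><Paragraph/><Chapter><Paragraph/></Chapter></Chapter></Book>"
       "<Staff><Person><Address/></Person><Person><Address/></Person></Staff>"
       "</Library>")

root = ET.fromstring(DOC)
useful = can_reach(SCHEMA, "Paragraph")
visited = []
found = collect(root, "Paragraph", useful, visited)
total = sum(1 for _ in root.iter())
print(len(found), "targets;", len(visited), "of", total, "elements visited")
```

The Staff subtree is never entered because the schema shows that no Paragraph can occur below it. A traversal that ignores the schema would have to visit every element.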
The XPath definition is briefly summarized here from a Demeter point of view. Note that XPath is, from one point of view, more general than traversal strategies and, from another, more limited. XPath is more general because it allows predicates to select paths. In Demeter, such predicates are expressed using visitors and around methods. XPath is limited compared to traversal strategies because XPath only allows union of location paths and not general graphs. XPath is also limited because the semantics of an XPath expression with respect to an XML schema is an unordered node set without repetitions. The focus is only on the target nodes. In Demeter, the semantics is a traversal history that also cares about how the internal nodes are visited.
The XPath document first introduces relative and absolute location paths and then abbreviated versions of them. The abbreviated versions are similar to line strategies in Demeter where the strategy graph is a straight line. A location path is described by a sequence of location steps.
XPath uses a richer data model than Demeter class dictionaries. In XPath we have root nodes, element nodes, text nodes, attribute nodes, namespace nodes, processing instruction nodes, and comment nodes, while in Demeter we only have element nodes, some of them distinguished as root nodes. XPath requires that the objects are tree objects. The following definitions are used: Every node other than the root node has exactly one parent, which is either an element node or the root node. A root node or an element node is the parent of each of its child nodes. The descendants of a node are the children of the node and the descendants of the children of the node.
Some interesting XPath expressions are:
//Chapter//Paragraph
selects the Paragraph elements that are descendants of Chapter elements, where the Chapter elements may appear anywhere below the root node. Demeter equivalent:
from Root via Chapter to Paragraph.
The XPath expression
//Olist/Item
selects all the Item elements that have an Olist parent. Between the root node and the Olist node there can be any number of nodes. The Demeter equivalent would be:
from Root to Olist only-through -> *,*,* to Item.
Here only-through means that there is some edge between Olist and Item.
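Both expressions can be tried out with a short sketch. It assumes Python's standard xml.etree.ElementTree module, which implements a subset of XPath and evaluates paths relative to an element, so the expressions are written with a leading ".". The sample document is hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical document: paragraphs and items at different depths.
DOC = """
<Root>
  <Chapter>
    <Paragraph>p1</Paragraph>
    <Section>
      <Paragraph>p2</Paragraph>
      <Olist><Item>i1</Item><Item>i2</Item></Olist>
    </Section>
  </Chapter>
  <Appendix>
    <Paragraph>p3</Paragraph>
    <Ulist><Item>i3</Item></Ulist>
  </Appendix>
</Root>
"""

root = ET.fromstring(DOC)

# //Chapter//Paragraph: Paragraph descendants of any Chapter
paragraphs = [e.text for e in root.findall(".//Chapter//Paragraph")]

# //Olist/Item: Item elements whose parent is an Olist
items = [e.text for e in root.findall(".//Olist/Item")]

print(paragraphs)
print(items)
```

The paragraph in the Appendix and the Item below Ulist are not selected, although neither expression mentions Section, Appendix, or Ulist.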
The | operator computes the union of its operands, which must be node-sets. The union operator corresponds to the merge operator in Demeter, i.e., where a strategy graph node has at least two outgoing edges. But in XPath, the merge operator can only be used at the outermost level. We cannot have: /A // (B | C) // D.
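An inner merge such as /A // (B | C) // D can be rewritten as the outermost union /A//B//D | /A//C//D. The following minimal sketch emulates this. It assumes Python's standard xml.etree.ElementTree module, which does not support the | operator, so the union is computed in Python. The helper name union and the sample document are hypothetical.

```python
import xml.etree.ElementTree as ET


def union(*node_lists):
    """Merge node lists into one list without duplicates.
    Nodes are compared by identity; first-seen order is kept for readability."""
    seen, merged = set(), []
    for nodes in node_lists:
        for node in nodes:
            if id(node) not in seen:
                seen.add(id(node))
                merged.append(node)
    return merged


# Hypothetical document: d2 lies below both B and C, d3 below neither.
DOC = "<A><B><D>d1</D><C><D>d2</D></C></B><E><D>d3</D></E></A>"
root = ET.fromstring(DOC)  # root is the A element

via_b = root.findall(".//B//D")  # corresponds to /A//B//D
via_c = root.findall(".//C//D")  # corresponds to /A//C//D
result = [e.text for e in union(via_b, via_c)]
print(result)
```

The element d2 is reached by both operands but appears only once in the result, in line with the node-set semantics of the union.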
If XPath is used carelessly (i.e., without abbreviated traversal specifications), it may lead to systems that are hard to maintain because they violate the Law of Demeter.
Watch the following sites on the Data Binding Approach
that mimics how Demeter generates classes from
class dictionaries.
Java Architecture for XML Binding (JAXB)
Data Binding Approach (IBM site)
Data Binding Approach (SUN site)
This describes JSR 31 (Java Specification Request).
This facility is intended to become part of the Java 2 Platform, Standard Edition.
We are currently working on a JSR 31 prototype implementation that uses DemeterJ but replaces class dictionaries with XML schemas.
Data Binding Approach (BreezeFactor site)
What seems to be missing from the current XML work is the generation of visitors to facilitate the processing of XML documents. The use of a navigation language a la DJ (http://www.ccs.neu.edu/research/demeter/DJ/) also seems to be missing.
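What traversal-visitor style processing of an XML document could look like is sketched below. The sketch is hand-written and does not use DJ or any generated code: the classes Visitor and TitleCollector, the function traverse, and the sample document are hypothetical, and the traversal simply visits every element instead of following a strategy.

```python
import xml.etree.ElementTree as ET


class Visitor:
    """Base visitor: calls visit_<Tag>(elem) if the subclass defines it."""

    def visit(self, elem):
        handler = getattr(self, "visit_" + elem.tag, None)
        if handler is not None:
            handler(elem)


def traverse(elem, visitor):
    """Depth-first traversal that hands every element to the visitor."""
    visitor.visit(elem)
    for child in elem:
        traverse(child, visitor)


class TitleCollector(Visitor):
    """Behavior is attached to one element type only; the rest of the
    document structure is not mentioned."""

    def __init__(self):
        self.titles = []

    def visit_Title(self, elem):
        self.titles.append(elem.text)


# Hypothetical document with Title elements at different depths.
DOC = ("<Library><Book><Title>T1</Title></Book>"
       "<Journal><Issue><Title>T2</Title></Issue></Journal></Library>")

collector = TitleCollector()
traverse(ET.fromstring(DOC), collector)
print(collector.titles)
```

The visitor names only the Title element, so the code stays valid when the elements around Title are rearranged.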
An interesting question is why the XML community went through the complex DOM- and SAX-based interfaces first before arriving at the data binding approach. The Demeter work has promoted this approach since the late 1980s for LL(1) languages. Mark-up languages are a special case of LL(1) languages.
The company Innovision develops technology to translate XML schemas into Java applications. XML schemas are very similar to Demeter class dictionaries and therefore the Demeter translation approaches are relevant here. See the white paper on how traversal-visitor style programming is used and how various methods are automatically generated from the XML schema in a similar way as in Demeter/Java and Demeter/C++. (A local copy is here). New in Innovision's solution is the XML protocol generation.
An XML paper on structure-shyness.
Connections between XML schemas and UML class diagrams. For related slides see From UML to XML.
A JSP file is typically an XML document with embedded Java code that computes information to be displayed on a web page. An XSL file (style sheet) translates the XML file to HTML and determines the display of the page. The following aspects are involved:
1. The document structure (XML schema).
2. The document display (XSL style sheet).
3. The Java structure: the UML class graph for the Java classes involved. There is a connection between document structure and Java class structure.
4. The Java behavior: the adaptive methods attached to the Java classes.
Both the Java behavior and the XSL style sheet can be written in a structure-shy way.
Professor Karl J. Lieberherr, College of Computer and Information Science, Northeastern University, 360 Avenue of the Arts, Boston, MA 02115-9959. Phone: (617) 373 2077 / Fax: (617) 373 5121
Written Dec. 27, 1999. Updated Nov. 25, 2005.