COM 3360 Fall 99 Term Project
Part 1
Evaluating XPath for Structure-Shyness
Submitted by : Vibhas & Prakash.
Faculty : Prof Karl Lieberherr.
Index
XPath Basics
Defining Structure-Shyness
Analysis Approach
UML for the test application
XPath In Action
Test Files
Test Cases
Conclusion
Future Work
Looking at Structure-Shyness of XPath
XPath is a W3C standard (a "recommendation" in W3C parlance, as opposed
to a "working draft") for navigating and addressing parts of an XML
document. It is meant to be used with two other W3C standards: XSLT, the
language for document transformation, and XPointer, the language for
multi-directional links. This project attempts to evaluate the
structure-shyness of XPath.
XSLT (XSL Transformations, where XSL is the Extensible Stylesheet Language) converts an XML document from one form to another (typically HTML). It takes two input files, an XML document (along with its DTD or equivalent) and a file of XSLT instructions, and outputs another XML document. XPointer specifies complex links such as bi-directional links and allows a specific part of an XML document to be located. Any fragment identifier that points into an XML resource must be an XPointer.
Both languages need to navigate an XML document, so the navigation part was factored out into a separate language: XPath.
XPath gets its name from its use of a path notation, as in URLs, for navigating through the hierarchical structure of an XML document. It uses a compact, non-XML syntax to facilitate its use within URIs and XML attribute values, and it operates on the abstract, logical structure of an XML document rather than its surface syntax. The primary syntactic construct of XPath is the expression, which evaluates to yield an object of one of four basic types: node-set, boolean, number, or string.
| Construct | Meaning | Example |
|---|---|---|
| / | Gets all matching children | Chapter/Section/* : returns all child elements of each "Section" node that is itself a child of a "Chapter" node (that is, grandchildren of "Chapter"). |
| // | Gets all matching descendants | Chapter//Paragraph : returns all Paragraph nodes that are descendants of the Chapter node, no matter how many nodes come in between. |
| @ | Gets all matching attributes | @UNITS : returns the UNITS attribute of the current node. |
| comment() | Gets comment nodes | Chapter/*/comment() : returns the comment nodes found directly inside each child element of the Chapter node. |
| [ ] | Gets nodes satisfying a condition | Chapter[Annotation] : returns Chapter nodes that have an Annotation child element. For an optional Annotation attribute the test is written Chapter[@Annotation]. |
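
The constructs in the table can be tried out without an XSLT processor. The following is a minimal sketch in Python using the standard-library xml.etree.ElementTree module, which implements only a subset of XPath 1.0 (it has no comment() test and cannot return attribute nodes directly). The document fragment is modelled on the periodic-table sample used in this report, and the variable names are illustrative assumptions.

```python
import xml.etree.ElementTree as ET

# Fragment modelled on the periodic-table sample used in this report.
DOC = """
<PERIODIC_TABLE>
  <ATOM STATE="GAS">
    <NAME>Hydrogen</NAME>
    <MELTING_POINT UNITS="Kelvin">13.81</MELTING_POINT>
  </ATOM>
  <ATOM STATE="GAS">
    <NAME>Helium</NAME>
    <MELTING_POINT UNITS="Kelvin">0.95</MELTING_POINT>
  </ATOM>
</PERIODIC_TABLE>
"""

# The context node for every path below is the PERIODIC_TABLE element.
root = ET.fromstring(DOC)

# "/" : child steps. ATOM/* selects every child element of every ATOM.
atom_children = [e.tag for e in root.findall("ATOM/*")]

# "//" : descendants at any depth below the context node.
melting_points = [e.text for e in root.findall(".//MELTING_POINT")]

# "@" : attributes. ElementTree cannot return attribute nodes, so we select
# the owning elements with [@UNITS] and read the value with get().
units = [e.get("UNITS") for e in root.findall(".//*[@UNITS]")]

# "[ ]" : predicates. ATOM[NAME] keeps ATOMs that have a NAME child element;
# ATOM[2] keeps the second ATOM (positions start at 1).
atoms_with_name = root.findall("ATOM[NAME]")
second_atom_name = root.find("ATOM[2]/NAME").text
```

Because ElementTree evaluates paths relative to an element, the leading document-root step of a full XPath expression is replaced here by the choice of context node.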
Rather than give a formal definition, we feel it suffices in the context
of this project to define structure-shyness as follows: a piece of code
involving any kind of traversal is structure-shy if the details of the
route followed during the traversal are not "hard-wired" into it. This is,
in essence, also what the "Law of Demeter" (LOD) implies, and it forms the
basis of flexible software development.
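
To make the definition concrete, the sketch below contrasts a hard-wired path with a structure-shy one when the document structure changes. It is written in Python with the standard-library xml.etree.ElementTree module. Both document layouts and the function name product_names are hypothetical and are not taken from the project's test files.

```python
import xml.etree.ElementTree as ET

# Two hypothetical layouts of the same invoice data; the second nests
# Entry elements inside an extra Entries wrapper.
FLAT = """
<Invoice_collection>
  <Invoice><Entry><Product Prod_name="bolt"/></Entry></Invoice>
</Invoice_collection>
"""
NESTED = """
<Invoice_collection>
  <Invoice><Entries><Entry><Product Prod_name="bolt"/></Entry></Entries></Invoice>
</Invoice_collection>
"""

HARD_WIRED = "Invoice/Entry/Product"  # names every step of the route
STRUCTURE_SHY = ".//Product"          # names only the target


def product_names(xml_text, path):
    """Return the Prod_name of every Product element reached by path."""
    root = ET.fromstring(xml_text)
    return [p.get("Prod_name") for p in root.findall(path)]
```

With these definitions, product_names(NESTED, HARD_WIRED) finds nothing because the route no longer exists, while product_names(NESTED, STRUCTURE_SHY) still reaches the Product element.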
We adopted a two-step approach. In the first step we executed various
XPath traversals on a sample XML application data file. In the second
step we attempted to find equivalent XPath expressions for all the
Demeter Java traversal primitives described in the AP book. Since XPath
is an emerging standard (work started in April 1999 and version 1.0 was
published only this month), few tools support it yet. Those that we found
(for example Caucho's Resin 1.0) supply XPath Java APIs that would have to
be called from a program written specifically as our testing tool. We
decided to use a tool that needed minimal effort to set up. The best
candidate appeared to be XT, written by James Clark (an editor of the W3C
XPath recommendation).
XT is a Java utility that takes two input files: a .xml file containing the XML document and a .xsl file containing XSLT instructions with embedded XPath expressions. The output is another XML file. No compiling or elaborate coding is necessary to test the various navigational combinations. Please note that the abbreviated XPath syntax is used throughout this report.
Here is an example of a .xml file:

    <?xml version="1.0"?>
    <?xml-stylesheet type="text/xml" href="14-2.xsl"?>
    <PERIODIC_TABLE>
      <ATOM STATE="GAS">
        <NAME>Hydrogen</NAME>
        <SYMBOL>H</SYMBOL>
        <ATOMIC_NUMBER>1</ATOMIC_NUMBER>
        <ATOMIC_WEIGHT>1.00794</ATOMIC_WEIGHT>
        <BOILING_POINT UNITS="Kelvin">20.28</BOILING_POINT>
        <MELTING_POINT UNITS="Kelvin">13.81</MELTING_POINT>
        <DENSITY UNITS="grams/cubic centimeter"><!-- At 300K --> 0.0899 </DENSITY>
      </ATOM>
      <ATOM STATE="GAS">
        <NAME>Helium</NAME>
        <SYMBOL>He</SYMBOL>
        <ATOMIC_NUMBER>2</ATOMIC_NUMBER>
        <ATOMIC_WEIGHT>4.0026</ATOMIC_WEIGHT>
        <BOILING_POINT UNITS="Kelvin">4.216</BOILING_POINT>
        <MELTING_POINT UNITS="Kelvin">0.95</MELTING_POINT>
        <DENSITY UNITS="grams/cubic centimeter"><!-- At 300K --> 0.1785 </DENSITY>
      </ATOM>
    </PERIODIC_TABLE>

And here is an example of a .xsl file with embedded XPath expressions. The select attributes in the file below can contain any valid XPath expression; the match attributes take XSLT patterns, which use a restricted subset of the XPath syntax.

    <?xml version="1.0"?>
    <xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

      <!-- Root element: emit the page skeleton, then process each ATOM child -->
      <xsl:template match="/PERIODIC_TABLE">
        <html>
          <body>
            <h1>Atomic Number vs. Melting Point</h1>
            <table>
              <th>Element</th>
              <th>Atomic Number</th>
              <th>Melting Point</th>
              <xsl:apply-templates/>
            </table>
          </body>
        </html>
      </xsl:template>

      <!-- One table row per ATOM -->
      <xsl:template match="ATOM">
        <tr>
          <td>
            <xsl:value-of select="NAME"/>
          </td>
          <td>
            <!-- alternate (unabbreviated) syntax for the child axis -->
            <xsl:value-of select="child::ATOMIC_NUMBER"/>
          </td>
          <td>
            <!-- union operator; value-of prints only the first node of the
                 union in document order, which is SYMBOL in the sample file -->
            <xsl:value-of select="ATOMIC_NUMBER|ATOMIC_WEIGHT|SYMBOL"/>
          </td>
        </tr>
      </xsl:template>

      <!-- "." is the current node, "@UNITS" is its UNITS attribute -->
      <xsl:template match="MELTING_POINT">
        <xsl:value-of select="."/>
        <xsl:value-of select="@UNITS"/>
      </xsl:template>

    </xsl:stylesheet>
We used an Invoice-Data application schema as our test example. It consisted of the following files, located on the CCS machines at ~vibhasp/com3360/project/Xpath:
XT :
A combination of test cases covering elements, attributes, "/", "//" and
the wildcard character * was tested to evaluate structure-shyness. See the
output files corresponding to the test case numbers for the actual tests
performed.
| No. | Test case | XPath expression | Result |
|---|---|---|---|
| 1. | Fetching an element | Invoice | Returns all Invoice nodes, including their children. |
| 2. | Fetching an attribute | @Type | Returns all Type attributes. |
| 3. | Fetching a grandchild element of every parent | Invoice/*/Entry | Returns all Entry nodes that are grandchildren of an Invoice node. |
| 4. | Fetching all matching descendants | Invoice_collection//Annexture | Returns all Annexture nodes that are descendants of the root. |
| 5. | Fetching an element using its position | Invoice[2] | Returns the second Invoice node. |
| 6. | Fetching all product names for invoices having 2 entries | Invoice[Entries/@n = "2"]//@Prod_name | Returns all Prod_name attributes for invoices having 2 entries. |
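
The test cases above can be approximated outside XT. The sketch below uses Python's standard-library xml.etree.ElementTree module, which supports only a subset of XPath 1.0, so test case 6 is emulated in two steps instead of being written as a single expression. The invoice document is hypothetical: only the element and attribute names come from the table, while the nesting and the values are assumptions.

```python
import xml.etree.ElementTree as ET

# Hypothetical invoice data; only the element and attribute names are
# taken from the test-case table, the nesting and values are assumed.
DOC = """
<Invoice_collection>
  <Invoice>
    <Person Type="sales"/>
    <Entries n="2">
      <Entry><Product Prod_name="bolt"/></Entry>
      <Entry><Product Prod_name="nut"/></Entry>
    </Entries>
    <Annexture>terms</Annexture>
  </Invoice>
  <Invoice>
    <Person Type="customer"/>
    <Entries n="1">
      <Entry><Product Prod_name="gear"/></Entry>
    </Entries>
  </Invoice>
</Invoice_collection>
"""

# The context node for every path below is the Invoice_collection element.
root = ET.fromstring(DOC)

# Case 1: Invoice
case1 = root.findall("Invoice")
# Case 2: @Type, collected here from every descendant that carries it
case2 = [e.get("Type") for e in root.findall(".//*[@Type]")]
# Case 3: Invoice/*/Entry
case3 = root.findall("Invoice/*/Entry")
# Case 4: Invoice_collection//Annexture
case4 = root.findall(".//Annexture")
# Case 5: Invoice[2]
case5 = root.find("Invoice[2]")

# Case 6: Invoice[Entries/@n = "2"]//@Prod_name
# ElementTree has no path-valued predicates and cannot select attributes,
# so first filter the invoices, then collect Prod_name from descendants.
two_entry_invoices = [
    inv for inv in root.findall("Invoice")
    if inv.find("Entries[@n='2']") is not None
]
case6 = [
    e.get("Prod_name")
    for inv in two_entry_invoices
    for e in inv.iter()
    if "Prod_name" in e.attrib
]
```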
Comparing XPath with Demeter Java Traversal Primitives
Table 1: Test cases for which Demeter Java and XPath have equivalent expressions.

| No. | Demeter Java traversal | XPath expression | Result |
|---|---|---|---|
| 1. | {Invoice_collection to Product} | Invoice_collection//Product | Equivalent |
| 2. | Join operation {Invoice_collection to Address via sales} | Invoice_collection//Person[@Type = "sales"]//Address | Equivalent |
| 3. | Bypassing {Invoice_collection to Address bypassing sales} | Invoice_collection//Person[not(@Type = "sales")]//Address | Equivalent |
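
The three traversals in Table 1 can be sketched in Python with the standard-library xml.etree.ElementTree module. That module has no not() function, so the bypassing case is written as an explicit filter that mirrors not(@Type = "sales"). The sample document and the variable names are hypothetical. The sketch follows this report's mapping of the Demeter "via sales" and "bypassing sales" constraints onto the Type attribute of Person, which is an assumption about the schema.

```python
import xml.etree.ElementTree as ET

# Hypothetical data using the element names from Table 1.
DOC = """
<Invoice_collection>
  <Invoice>
    <Person Type="sales"><Address>1 Main St</Address></Person>
    <Person Type="customer"><Address>2 Side St</Address></Person>
    <Product Prod_name="bolt"/>
  </Invoice>
</Invoice_collection>
"""

# The context node for every path below is the Invoice_collection element.
root = ET.fromstring(DOC)

# Row 1: {Invoice_collection to Product}
products = [p.get("Prod_name") for p in root.findall(".//Product")]

# Row 2: {Invoice_collection to Address via sales}
via_sales = [
    a.text for a in root.findall(".//Person[@Type='sales']//Address")
]

# Row 3: {Invoice_collection to Address bypassing sales}
# Equivalent of Person[not(@Type = "sales")]//Address as a Python filter.
bypassing_sales = [
    a.text
    for person in root.iter("Person")
    if person.get("Type") != "sales"
    for a in person.iter("Address")
]
```

In all three cases only the start, the target and the constraint are named; the intermediate Invoice element never appears in the path.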
Table 2: Test cases for which XPath has no equivalent of the Demeter Java traversal.

| No. | Demeter Java traversal | XPath expression |
|---|---|---|
| 1. | Merge operation {Invoice_collection to {Name, Address}} | Still searching for equivalent syntax. |
Table 3: Test cases for which Demeter Java has no equivalent of the XPath expression.

| No. | XPath expression | Operation | Remarks |
|---|---|---|---|
| 1. | @Maker | Get attributes | The main problem is that Demeter Java has no direct way of distinguishing child nodes from attributes. |