Improve XPath Efficiency with VTD-XML  : Page 4

Even though XML and XPath have been around for several years, there's still room for performance improvements—and VTD-XML and its XPath implementation provide them.


Common Usage Patterns
There are a few common patterns to follow in order to base your application on VTD-XML.

Embed evalXPath(..) in while Statements
The most common pattern is to embed the call to evalXPath() in the control statement of a while loop. All the code examples you've seen so far use this pattern to navigate to the qualifying nodes in the document. In this pattern, the while loop checks the return value of evalXPath(); when it equals -1, there are no more matching nodes in the node set. Otherwise, the return value equals the VTD record index of the matching node, to which the cursor of the VTDNav object (bound to the AutoPilot object) also points. The following two short code fragments illustrate this. Although they differ slightly in syntax, they produce equivalent results:

int i;
ap.bind(vn);
ap.selectXPath(...);
while ((i = ap.evalXPath()) != -1) {
}

int i;
ap.bind(vn);
ap.selectXPath(...);
while (ap.evalXPath() >= 0) {
    i = vn.getCurrentIndex();
}

Combining XPath with Manual Navigation
Sometimes you want to move the VTDNav cursor a bit further within the while loop. The rule is that the code nested in the loop must leave the node position unchanged by the time each iteration ends. You can comply in two ways: navigate wherever you like and then manually move the cursor back to its original location before the end of the loop, or use the VTDNav object's push() and pop() methods, which save cursor locations on a stack and restore them. I'll show you both patterns.

Using the catalog.xml file in Listing 1, this first example navigates manually within the while loop to the first child node of each matching element, extracts the text, and then moves the cursor back to the starting position before the next iteration:

import com.ximpleware.*;

public class vtdXPath2 {
    public static void main(String args[]) throws Exception {
        VTDGen vg = new VTDGen();
        int i;
        AutoPilot ap = new AutoPilot();
        ap.selectXPath("/CATALOG/CD[PRICE < 10]");
        if (vg.parseFile("catalog.xml", false)) {
            VTDNav vn = vg.getNav();
            ap.bind(vn);
            // XPath eval returns one node at a time
            while ((i = ap.evalXPath()) != -1) {
                // get to the first child
                if (vn.toElement(VTDNav.FIRST_CHILD, "TITLE")) {
                    int j = vn.getText();
                    if (j != -1)
                        System.out.println(" text node ==>" + vn.toString(j));
                    vn.toElement(VTDNav.PARENT); // move the cursor back
                }
            }
            ap.resetXPath();
        }
    }
}

This second example (which is slightly less efficient) uses push() and pop() to reset the cursor position:



import com.ximpleware.*;

public class vtdXPath3 {
    public static void main(String args[]) throws Exception {
        VTDGen vg = new VTDGen();
        int i;
        AutoPilot ap = new AutoPilot();
        ap.selectXPath("/CATALOG/CD[PRICE < 10]");
        if (vg.parseFile("catalog.xml", false)) {
            VTDNav vn = vg.getNav();
            ap.bind(vn);
            // XPath eval returns one node at a time
            while ((i = ap.evalXPath()) != -1) {
                // push the current cursor position
                vn.push();
                // get to the first child
                if (vn.toElement(VTDNav.FIRST_CHILD, "TITLE")) {
                    int j = vn.getText();
                    if (j != -1)
                        System.out.println(" text node ==>" + vn.toString(j));
                }
                // restore the cursor position
                vn.pop();
            }
            ap.resetXPath();
        }
    }
}
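The save-and-restore discipline behind push() and pop() can be seen in isolation with a toy model. The sketch below is not the VTD-XML API: ToyCursor, CursorStackDemo, and wanderAndRestore are hypothetical names invented for illustration, and the "cursor" is just an integer position. It only shows why a stack lets nested code wander freely while the enclosing loop still sees the position it expects.

```java
import java.util.ArrayDeque;
import java.util.Deque;

/** Toy cursor over an integer position. Hypothetical; NOT the VTD-XML API. */
class ToyCursor {
    private int position;
    private final Deque<Integer> saved = new ArrayDeque<>();

    int position() { return position; }

    void moveTo(int newPosition) { position = newPosition; }

    /** Save the current position on the stack. */
    void push() { saved.push(position); }

    /** Restore the most recently saved position; false if nothing was saved. */
    boolean pop() {
        if (saved.isEmpty()) {
            return false;
        }
        position = saved.pop();
        return true;
    }
}

class CursorStackDemo {
    /** Wanders five steps away, records where it got to, then restores the cursor. */
    static int wanderAndRestore(ToyCursor cursor) {
        cursor.push();                          // remember where the loop left us
        cursor.moveTo(cursor.position() + 5);   // navigate somewhere else
        int visited = cursor.position();
        cursor.pop();                           // put the cursor back before returning
        return visited;
    }

    public static void main(String[] args) {
        ToyCursor cursor = new ToyCursor();
        cursor.moveTo(3);
        int visited = wanderAndRestore(cursor);
        System.out.println("visited=" + visited + ", restored=" + cursor.position());
    }
}
```

Because every push() is paired with a pop(), the pattern also nests: an inner block can save and restore its own position without disturbing the outer one.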

Nested XPath Support
You aren't limited to manual navigation within the while loop. For more complicated navigation, you can nest XPath queries. The following example is equivalent to the examples in the preceding section, but has been rewritten to use nested XPath queries. Note that the rule that ensures cursor position consistency still applies:

import com.ximpleware.*;

public class vtdXPath4 {
    public static void main(String args[]) throws Exception {
        VTDGen vg = new VTDGen();
        int i;
        AutoPilot ap = new AutoPilot();
        ap.selectXPath("/CATALOG/CD[PRICE < 10]");
        AutoPilot ap2 = new AutoPilot();
        ap2.selectXPath("TITLE/text()");
        if (vg.parseFile("catalog.xml", false)) {
            VTDNav vn = vg.getNav();
            ap.bind(vn);
            ap2.bind(vn);
            // XPath eval returns one node at a time
            while ((i = ap.evalXPath()) != -1) {
                vn.push();
                // evaluate the nested query relative to the current CD element
                int j;
                while ((j = ap2.evalXPath()) != -1) {
                    System.out.println(" text node ==>" + vn.toString(j));
                }
                // don't forget this next statement; the next iteration
                // will reuse ap2's XPath!!!
                ap2.resetXPath();
                vn.pop();
            }
            ap.resetXPath();
        }
    }
}
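For comparison, here is a rough sketch of the same nested query written against a DOM tree with the JDK's built-in javax.xml.xpath package, the style of API that VTD-XML is measured against later in this article. It is not part of VTD-XML. The class name DomNestedXPath, the method cheapTitles, and the inline sample data are assumptions made for illustration; the sample stands in for catalog.xml, whose real contents are in Listing 1.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpression;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class DomNestedXPath {
    // Hypothetical sample data standing in for catalog.xml
    static final String SAMPLE =
          "<CATALOG>"
        + "<CD><TITLE>Alpha</TITLE><PRICE>9.90</PRICE></CD>"
        + "<CD><TITLE>Beta</TITLE><PRICE>10.90</PRICE></CD>"
        + "<CD><TITLE>Gamma</TITLE><PRICE>7.80</PRICE></CD>"
        + "</CATALOG>";

    /** Returns the TITLE text of every CD priced under 10, in document order. */
    static List<String> cheapTitles(String xml) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));

        XPath xpath = XPathFactory.newInstance().newXPath();
        XPathExpression outer = xpath.compile("/CATALOG/CD[PRICE < 10]");
        XPathExpression inner = xpath.compile("TITLE/text()");

        List<String> titles = new ArrayList<>();
        NodeList cds = (NodeList) outer.evaluate(doc, XPathConstants.NODESET);
        for (int i = 0; i < cds.getLength(); i++) {
            Node cd = cds.item(i);
            // The inner query is evaluated relative to the current CD node
            NodeList texts = (NodeList) inner.evaluate(cd, XPathConstants.NODESET);
            for (int j = 0; j < texts.getLength(); j++) {
                titles.add(texts.item(j).getNodeValue());
            }
        }
        return titles;
    }

    public static void main(String[] args) throws Exception {
        for (String title : cheapTitles(SAMPLE)) {
            System.out.println(" text node ==>" + title);
        }
    }
}
```

In the DOM version the context node is passed explicitly to each evaluate() call, so there is no shared cursor to save and restore. The VTD-XML version trades that convenience for a single cursor, which is why the push(), pop(), and resetXPath() calls matter there.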

Ad-Hoc Tree-walker Mode
As mentioned earlier, one way to use evalXPath() is just to move the cursor to the desired destination(s) in the XML document:

import com.ximpleware.*;

public class vtdXPath5 {
    public static void main(String args[]) throws Exception {
        VTDGen vg = new VTDGen();
        int i;
        AutoPilot ap = new AutoPilot();
        ap.selectXPath("/CATALOG/CD[PRICE < 10]");
        if (vg.parseFile("catalog.xml", false)) {
            VTDNav vn = vg.getNav();
            ap.bind(vn);
            // XPath eval returns one node at a time
            if (ap.evalXPath() >= 0) {
                // do whatever you like here with vn
            }
        }
    }
}

XPath Evaluation Performance Benchmarks
Figure 1. XPath Evaluation Performance Comparison: The charts in this figure show the average latencies of several different query executions for JAXEN, Xalan, and VTD-XML against three documents of similar structure, but differing in size.
Finally, to round out the performance claims, the figures in this section briefly show the results of comparing VTD-XML's XPath performance with that of Xalan and JAXEN, both of which work with the Xerces DOM. The benchmark programs pre-compile a set of XPath expressions and then repeatedly execute the queries over the same set of documents. The charts in Figure 1 show the average latencies of several different query executions for JAXEN, Xalan, and VTD-XML against three documents of similar structure, but differing in size.

You can read a complete description and discussion of the test setup here, but the figures shown here should give you a very good idea of the overall results.
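The general shape of such a measurement (compile the expression once, run it many times over an already-parsed document, and average the elapsed time) can be sketched as follows. This is not the benchmark harness used for Figure 1. It uses the JDK's javax.xml.xpath over DOM only so that it runs without extra libraries, and the names XPathLatencySketch, Result, and measure, along with the warmup and run counts, are assumptions for illustration.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpression;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class XPathLatencySketch {
    /** Number of matching nodes and average nanoseconds per query execution. */
    static final class Result {
        final int matches;
        final double avgNanos;

        Result(int matches, double avgNanos) {
            this.matches = matches;
            this.avgNanos = avgNanos;
        }
    }

    static Result measure(String xml, String query, int warmup, int runs) throws Exception {
        if (runs <= 0) {
            throw new IllegalArgumentException("runs must be positive");
        }
        // Parse once; parsing time is deliberately excluded from the measurement
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
        // Pre-compile the expression once, then reuse it for every execution
        XPathExpression expr = XPathFactory.newInstance().newXPath().compile(query);

        int matches = 0;
        for (int i = 0; i < warmup; i++) {       // untimed warm-up executions
            matches = ((NodeList) expr.evaluate(doc, XPathConstants.NODESET)).getLength();
        }
        long start = System.nanoTime();
        for (int i = 0; i < runs; i++) {         // timed executions
            matches = ((NodeList) expr.evaluate(doc, XPathConstants.NODESET)).getLength();
        }
        long elapsed = System.nanoTime() - start;
        return new Result(matches, (double) elapsed / runs);
    }

    public static void main(String[] args) throws Exception {
        String xml = "<CATALOG><CD><PRICE>5</PRICE></CD><CD><PRICE>15</PRICE></CD></CATALOG>";
        Result r = measure(xml, "/CATALOG/CD[PRICE < 10]", 100, 1000);
        System.out.println(r.matches + " match(es), avg " + r.avgNanos + " ns per query");
    }
}
```

Timings from a sketch like this vary with the JVM, the hardware, and the document size, so treat them as relative indicators rather than figures comparable to those in Figure 1.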

In this article, you've seen an introduction and overview of VTD-XML's XPath implementation. VTD-XML empowers you to write clean, readable code and achieve peerless performance, efficiency, and flexibility. In fact, VTD-XML is changing some of the most deeply entrenched perceptions and assumptions about XML programming. Consequently, the infrastructure built on top of XML, such as Web services and SOA, is on the verge of positive changes as well.



Jimmy Zhang is a co-founder of XimpleWare, a provider of high-performance XML processing solutions. He has experience in the fields of electronic design automation and Voice-over IP, having worked for a number of Silicon Valley high-tech companies. He graduated from UC Berkeley with an MS and a BS from the department of EECS.