Improve XPath Efficiency with VTD-XML
Even though XML and XPath have been around for several years, there's still room for performance improvements—and VTD-XML and its XPath implementation provide them. 

For the past several years, XPath has been steadily gaining popularity as an effective tool for developing XML applications. XPath was originally viewed as an adjunct to the W3C's XSLT and XPointer specifications, but developers found its simplicity appealing. With XPath, instead of manually navigating the hierarchical data structure, you can use compact, "file-system"-like expressions to address any node or set of nodes in XML documents. However, most existing XPath engines work with DOM trees or similar object models, which are slow to build and modify and which consume excessive amounts of memory. This presents a dilemma for anyone looking to take advantage of XPath for SOA applications that are either performance sensitive or routinely deal with large XML documents.
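
To make the contrast between manual navigation and a path expression concrete, here is a minimal sketch. It deliberately uses the JDK's built-in DOM parser and javax.xml.xpath engine, not VTD-XML, so it also shows the DOM-based style of XPath engine the paragraph above describes. The sample document and the class and method names are invented for illustration.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;

final class XPathVersusManual {
    // Hypothetical sample document, invented for this sketch.
    static final String XML =
        "<orders><order><item>pen</item></order><order><item>ink</item></order></orders>";

    // Builds a DOM tree with the JDK parser (not VTD-XML).
    static Document parse() {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new ByteArrayInputStream(XML.getBytes(StandardCharsets.UTF_8)));
        } catch (Exception e) {
            throw new IllegalStateException("sample document failed to parse", e);
        }
    }

    // Manual navigation: walk orders -> order -> item by hand.
    static List<String> itemsManually(Document doc) {
        List<String> out = new ArrayList<>();
        NodeList orders = doc.getDocumentElement().getChildNodes();
        for (int i = 0; i < orders.getLength(); i++) {
            Node order = orders.item(i);
            if (order.getNodeType() != Node.ELEMENT_NODE
                    || !"order".equals(order.getNodeName())) {
                continue;
            }
            NodeList kids = order.getChildNodes();
            for (int j = 0; j < kids.getLength(); j++) {
                Node kid = kids.item(j);
                if (kid.getNodeType() == Node.ELEMENT_NODE
                        && "item".equals(kid.getNodeName())) {
                    out.add(kid.getTextContent());
                }
            }
        }
        return out;
    }

    // The same selection expressed as one path expression.
    static List<String> itemsWithXPath(Document doc) {
        try {
            NodeList hits = (NodeList) XPathFactory.newInstance().newXPath()
                .evaluate("/orders/order/item", doc, XPathConstants.NODESET);
            List<String> out = new ArrayList<>();
            for (int i = 0; i < hits.getLength(); i++) {
                out.add(hits.item(i).getTextContent());
            }
            return out;
        } catch (Exception e) {
            throw new IllegalStateException("XPath evaluation failed", e);
        }
    }

    public static void main(String[] args) {
        Document doc = parse();
        System.out.println(itemsManually(doc));  // prints [pen, ink]
        System.out.println(itemsWithXPath(doc)); // prints [pen, ink]
    }
}
```

Both methods return the same two strings; the XPath version replaces the nested loops with the single expression /orders/order/item.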


My last two articles with DevX (see the Related Resources) introduced VTD-XML as a next-generation XML processing model that goes beyond DOM and SAX in performance, memory usage, and ease of use. VTD-XML is simultaneously:

  • Memory-efficient: VTD-XML typically requires only 1.3 to 1.5 times the size of the XML document itself, which makes it far more memory-efficient than DOM and well suited to large XML documents.
  • High-performance: VTD-XML typically outperforms DOM parsers by 5 to 10 times, and it typically outperforms SAX parsers with null content handlers by about 100 percent.
  • Easy to use: Applications written in VTD-XML are more compact and readable than those written in DOM or SAX.
What is VTD-XML's not-so-secret sauce? Unlike traditional XML parsers, which create string-based tokens as the first step of parsing, VTD-XML uses linear buffers internally to store 64-bit integers containing the starting offsets and lengths of XML tokens, while keeping the undecoded XML document intact in memory. All of VTD-XML's benefits result, one way or another, from this "non-extractive" tokenization. At the API level, VTD-XML consists of the following core classes:

  • VTDGen encapsulates the main parsing, index writing, and index loading functions.
  • VTDNav exports a cursor-based API that contains the methods that navigate the XML hierarchy.
  • AutoPilot supports document-order element traversal—similar to Xerces' NodeIterator.
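
The "non-extractive" tokenization idea described above can be sketched in a few lines: each token is recorded as a 64-bit integer holding an offset and a length into the original bytes, and a string is created only when the application asks for one. The bit layout below (32 bits for each field) is an assumption chosen for readability and is not VTD-XML's actual record format; the class and method names are likewise hypothetical.

```java
import java.nio.charset.StandardCharsets;

final class TokenRecordSketch {
    // Illustrative layout only (an assumption, not VTD-XML's real format):
    // high 32 bits = starting offset, low 32 bits = token length.
    static long pack(int offset, int length) {
        return ((long) offset << 32) | (length & 0xFFFFFFFFL);
    }

    static int offset(long record) {
        return (int) (record >>> 32);
    }

    static int length(long record) {
        return (int) record;
    }

    // Decodes a token on demand; the document bytes themselves are never copied
    // or modified when the record is created.
    static String text(byte[] doc, long record) {
        return new String(doc, offset(record), length(record), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        byte[] doc = "<a><b>hello</b></a>".getBytes(StandardCharsets.UTF_8);

        // A real parser would discover these positions; they are hard-coded here.
        long[] records = {
            pack(4, 1), // element name "b"
            pack(6, 5)  // character data "hello"
        };

        for (long r : records) {
            System.out.println(offset(r) + "+" + length(r) + " -> " + text(doc, r));
        }
    }
}
```

Because each record is a fixed-size primitive, a whole document's tokens fit in a long[] rather than in one object per token, which is the source of the memory savings the article attributes to this approach.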
VTD-XML's XPath Implementation
VTD-XML's XPath implementation, introduced with version 1.0, supports the full W3C XPath 1.0 spec. It builds upon VTDNav's concept of cursor-based navigation. The AutoPilot class exports all the XPath-related methods. As described in one of the earlier articles, to manually navigate an XML document's hierarchical structure, you obtain a VTDNav instance, and repeatedly call the toElement() method to move the cursor to various parts of the document. Using XPath you can either move the cursor manually or tell AutoPilot to move it to qualified nodes in the document automatically.

Table 1 shows AutoPilot's XPath-related methods.

Table 1. AutoPilot's XPath-related Methods: The table lists AutoPilot's XPath-related methods along with a short description of each.
Method | Description
declareXPathNameSpace(...) | Binds a namespace prefix (used in the XPath expression) to a URL.
selectXPath(...) | Compiles an XPath expression into an internal representation.
evalXPath(...) | Moves the cursor to a qualified node in the node set.
evalXPathToBoolean(...), evalXPathToNumber(...), evalXPathToString(...) | Evaluate an XPath expression to a Boolean, a double, and a string, respectively.
resetXPath() | Resets the internal state so the XPath can be reused.
getExprString() | Returns the compiled expression as a string, which you can use to verify its correctness.
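
The Boolean, number, and string evaluation modes in Table 1 correspond to result types defined by XPath 1.0 itself. The sketch below illustrates those three result types using the JDK's javax.xml.xpath engine rather than VTD-XML's AutoPilot, so none of the calls shown are VTD-XML APIs; the sample document and the helper names are invented for illustration.

```java
import java.io.StringReader;
import javax.xml.namespace.QName;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.xml.sax.InputSource;

final class TypedXPathSketch {
    // Hypothetical sample document, invented for this sketch.
    static final String XML = "<cart><item price='2.5'/><item price='4'/></cart>";

    // Evaluates an expression against the sample document, asking the
    // JDK engine (not VTD-XML) for a specific XPath 1.0 result type.
    static Object eval(String expr, QName type) {
        try {
            return XPathFactory.newInstance().newXPath()
                .evaluate(expr, new InputSource(new StringReader(XML)), type);
        } catch (XPathExpressionException e) {
            throw new IllegalArgumentException("bad XPath: " + expr, e);
        }
    }

    static boolean asBoolean(String expr) {
        return (Boolean) eval(expr, XPathConstants.BOOLEAN);
    }

    static double asNumber(String expr) {
        return (Double) eval(expr, XPathConstants.NUMBER);
    }

    static String asString(String expr) {
        return (String) eval(expr, XPathConstants.STRING);
    }

    public static void main(String[] args) {
        System.out.println(asBoolean("count(/cart/item) > 1")); // prints true
        System.out.println(asNumber("sum(/cart/item/@price)")); // prints 6.5
        System.out.println(asString("/cart/item[1]/@price"));   // prints 2.5
    }
}
```

Note that the last expression selects a node set; asking for a string result converts it to the string value of its first node in document order, as XPath 1.0 specifies.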

VTD-XML's XPath implementation also introduces two exception classes:

  • XPathParseException—Thrown when there is a syntax error in the XPath expression.
  • XPathEvalException—Thrown when an exception condition occurs during XPath evaluation.