Navigation Performance
Next, I benchmarked the navigation performance of VTD-XML and Xerces DOM. Because navigation behavior is specific to the file structure and tag names of the test documents, the test code (shown below) first parses po.xml and po_huge.xml, then traverses the document hierarchy corresponding to the XPath expression /purchaseOrder/items/item[@partNum='872-AA'], which navigates across the entire document from beginning to end. Comparing the code written for DOM and VTD-XML, you can see that the VTD-XML API is simpler than DOM's.
VTD-XML Code Measuring Navigation Performance

    for (int j = 0; j < 20; j++) {
        l = System.currentTimeMillis();
        for (int i = 0; i < total; i++) {
            count = 0;
            vn.toElement(VTDNav.ROOT);
            if (vn.matchElement("purchaseOrder")) {
                if (vn.toElement(VTDNav.FIRST_CHILD, "items")) {
                    do {
                        // visit every child element of the current <items>
                        if (vn.toElement(VTDNav.FIRST_CHILD)) {
                            do {
                                temp = vn.getAttrVal("partNum");
                                // getAttrVal returns -1 when the attribute is absent
                                if (temp != -1 && vn.matchTokenString(temp, "872-AA")) {
                                    count++;
                                }
                            } while (vn.toElement(VTDNav.NEXT_SIBLING));
                            vn.toElement(VTDNav.PARENT);
                        }
                    } while (vn.toElement(VTDNav.NEXT_SIBLING, "items"));
                }
            }
        }
        long l2 = System.currentTimeMillis();
        System.out.println("l2 - l " + (l2 - l) + " ms");
        System.out.println(" average nav time ==> " + ((double) (l2 - l) / total));
    }

DOM Code Measuring Navigation Performance

    for (int j = 0; j < 20; j++) {
        l = System.currentTimeMillis();
        for (int i = 0; i < total; i++) {
            Element current = d.getDocumentElement();
            count = 0;
            if (current.getNodeName().equals("purchaseOrder")) {
                // walk the children of the root, looking for <items> elements
                for (Node n = current.getFirstChild(); n != null; n = n.getNextSibling()) {
                    if (n.getNodeType() == Node.ELEMENT_NODE
                            && n.getNodeName().equals("items")) {
                        // a for loop also handles an empty <items>, where getFirstChild() is null
                        for (Node n1 = n.getFirstChild(); n1 != null; n1 = n1.getNextSibling()) {
                            if (n1.getNodeType() == Node.ELEMENT_NODE
                                    && n1.getNodeName().equals("item")) {
                                Element e = (Element) n1;
                                if (e.getAttribute("partNum").equals("872-AA")) {
                                    count++;
                                }
                            }
                        }
                    }
                }
            }
        }
        long l2 = System.currentTimeMillis();
        System.out.println("l2 - l " + (l2 - l) + " ms");
        System.out.println(" average nav time ==> " + ((double) (l2 - l) / total));
    }

Table 5. VTD-XML vs. Xerces DOM: Comparing navigation performances.
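As a reference point for what both listings count, the hand-written traversals can be compared with a direct XPath query. The following is a minimal sketch, not part of the benchmark: it uses the JDK's built-in javax.xml.xpath API rather than VTD-XML or Xerces directly, and the class name XPathCountSketch and the tiny SAMPLE document are hypothetical stand-ins for po.xml.

```java
import java.io.StringReader;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.xml.sax.InputSource;

class XPathCountSketch {
    // Hypothetical miniature document; the real test files are far larger.
    static final String SAMPLE =
        "<purchaseOrder>"
      + "<items>"
      + "<item partNum=\"872-AA\"/>"
      + "<item partNum=\"926-AA\"/>"
      + "</items>"
      + "<items>"
      + "<item partNum=\"872-AA\"/>"
      + "</items>"
      + "</purchaseOrder>";

    // Counts <item> elements under /purchaseOrder/items whose partNum matches.
    // The value is spliced into the expression for illustration only;
    // do not do this with untrusted input.
    static int countMatches(String xml, String partNum) throws XPathExpressionException {
        XPath xpath = XPathFactory.newInstance().newXPath();
        String expr = "count(/purchaseOrder/items/item[@partNum='" + partNum + "'])";
        Double n = (Double) xpath.evaluate(
            expr, new InputSource(new StringReader(xml)), XPathConstants.NUMBER);
        return n.intValue();
    }

    public static void main(String[] args) throws Exception {
        System.out.println(countMatches(SAMPLE, "872-AA"));
    }
}
```

On the sample document the query finds two matching items, one under each items element, which is the same tally the nested loops accumulate in count.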
The timing results, shown in Table 6, demonstrate that VTD-XML's random-access performance is comparable to DOM's. Even for XML documents exceeding 400 MB in size (for which DOM simply runs out of gas), VTD-XML doesn't miss a beat in its ability to jump between different nodes in the document hierarchy, and the navigation latency scales roughly linearly with the size of the document.
File Name (Size) | Navigation Performance VTD-XML | Navigation Performance DOM
po.xml (70.3 MB) | 241.4 ms | 303.9 ms
po_huge.xml (405.60 MB) | 1303.2 ms | N/A (Out of Memory)
Table 6. Navigation performance summary.
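The linear-scaling claim can be checked with simple arithmetic on the figures in Table 6: if latency grows linearly with size, the time per megabyte should stay roughly constant. The sketch below only divides the reported numbers; the class and method names are hypothetical, and two data points can suggest a trend but not prove it.

```java
import java.util.Locale;

class NavLatencyPerMb {
    // Milliseconds of navigation time per megabyte of document.
    static double msPerMb(double millis, double megabytes) {
        return millis / megabytes;
    }

    public static void main(String[] args) {
        // Figures copied from Table 6.
        double vtdSmall = msPerMb(241.4, 70.3);    // VTD-XML, po.xml
        double vtdHuge  = msPerMb(1303.2, 405.60); // VTD-XML, po_huge.xml
        double domSmall = msPerMb(303.9, 70.3);    // DOM, po.xml

        System.out.println(String.format(Locale.ROOT, "VTD-XML po.xml:      %.2f ms/MB", vtdSmall));
        System.out.println(String.format(Locale.ROOT, "VTD-XML po_huge.xml: %.2f ms/MB", vtdHuge));
        System.out.println(String.format(Locale.ROOT, "DOM po.xml:          %.2f ms/MB", domSmall));
    }
}
```

VTD-XML comes out at roughly 3.4 ms/MB on the smaller file and 3.2 ms/MB on the larger one, so the per-megabyte cost is about the same at both sizes; DOM is at roughly 4.3 ms/MB on the one file it could load.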
What Does All This Mean For You?
How will using VTD-XML change the way you do things? VTD-XML will first affect your choice of processing model. VTD-XML exposes the key weakness of SAX: its lack of random access. So, unless the XML document is too big to load into memory, you no longer have any incentive to go with SAX. Written in VTD-XML, your program code should be shorter, cleaner, and less bug-prone. And because VTD-XML's benefits apply to XML documents of different sizes and complexity, you no longer have to switch between the radically different parsing styles of DOM and SAX. This makes it easier to accomplish anything that XML, by design, allows you to do. You can download a representative example XML file and test it out yourself.
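To make the SAX trade-off concrete, here is a minimal sketch of the same count written against the JDK's SAX API. Because SAX delivers events in document order and offers no way to step back to a parent or sibling, the handler has to track its own position (here, a depth counter and two flags). The class name SaxCountSketch and the tiny SAMPLE document are hypothetical, and the code was not part of the benchmark.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

class SaxCountSketch {
    // Hypothetical miniature document standing in for po.xml.
    static final String SAMPLE =
        "<purchaseOrder>"
      + "<items><item partNum=\"872-AA\"/><item partNum=\"926-AA\"/></items>"
      + "<items><item partNum=\"872-AA\"/></items>"
      + "</purchaseOrder>";

    static int countMatches(String xml, String partNum) throws Exception {
        final int[] count = {0};
        DefaultHandler handler = new DefaultHandler() {
            private int depth = 0;           // current element nesting level
            private boolean inOrder = false; // root element is <purchaseOrder>
            private boolean inItems = false; // currently inside a depth-2 <items>

            @Override
            public void startElement(String uri, String localName,
                                     String qName, Attributes atts) {
                depth++;
                if (depth == 1) {
                    inOrder = qName.equals("purchaseOrder");
                } else if (depth == 2) {
                    inItems = inOrder && qName.equals("items");
                } else if (depth == 3 && inItems && qName.equals("item")
                        && partNum.equals(atts.getValue("partNum"))) {
                    count[0]++;
                }
            }

            @Override
            public void endElement(String uri, String localName, String qName) {
                if (depth == 2) {
                    inItems = false; // leaving a child of the root
                }
                depth--;
            }
        };
        // The default factory is not namespace-aware, so qName is the tag name.
        SAXParserFactory.newInstance().newSAXParser()
            .parse(new InputSource(new StringReader(xml)), handler);
        return count[0];
    }

    public static void main(String[] args) throws Exception {
        System.out.println(countMatches(SAMPLE, "872-AA"));
    }
}
```

The bookkeeping stays small for a single fixed path like this one, but every additional query or backward reference needs more hand-maintained state or another pass over the input, which is the cost that a random-access model avoids.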
Switching to VTD-XML can also feel like an instant hardware upgrade. If you have been thinking about adding more servers to keep up with a growing volume of XML, writing your applications in VTD-XML may be all you need. Whether it is batch processing or real-time transactions, VTD-XML should help you squeeze out more efficiency and make your applications run more smoothly and responsively.