A Step in the Right Direction: VTD-XML Improves XML Processing: Page 2

Find out how this next generation XML processing API goes beyond DOM and SAX in performance, memory usage, and ease-of-use.


Figure 1. DOM vs. VTD-XML: Results of the memory usage comparison.







VTD-XML's Memory and Performance
After parsing, VTD-XML does not create a large number of in-memory objects; instead, it allocates large memory blocks to store VTD tokens and XML structural information. To gauge the effectiveness of this approach, I compared the memory usage of VTD-XML and Xerces DOM (bundled with the JDK) on XML documents ranging from about one to 400 megabytes in size. The benchmark programs (shown below) were compiled and run using JDK version 1.5 on an Athlon64 3400+ box with 1GB of memory running Windows XP:

VTD-XML Code Measuring Memory Usage

import com.ximpleware.*;
import java.io.*;

public class benchmark_mem {
    public static void main(String[] args) {
        File f = new File(args[0]);
        try {
            Runtime rt = Runtime.getRuntime();
            // Baseline is taken before the file is loaded, so the result
            // includes the document bytes that VTD-XML keeps in memory.
            long startMem = rt.totalMemory() - rt.freeMemory();
            byte[] ba = new byte[(int) f.length()];
            DataInputStream dis = new DataInputStream(new FileInputStream(f));
            dis.readFully(ba);   // a single read() call may return early
            dis.close();
            VTDGen vg = new VTDGen();
            vg.setDoc(ba);
            vg.parse(true);      // true = namespace-aware parsing
            long endMem = rt.totalMemory() - rt.freeMemory();
            System.out.println("Memory Use: " + ((float) endMem - startMem) / (1 << 20) + " MB.");
        } catch (Exception e) {
            System.out.println("exception ==> " + e);
        }
    }
}

DOM Code Measuring Memory Usage

import java.io.*;
import javax.xml.parsers.*;
import org.w3c.dom.Document;

public class benchmarkDOM_mem {
    public static void main(String[] args) {
        File f = new File(args[0]);
        try {
            Runtime rt = Runtime.getRuntime();
            byte[] ba = new byte[(int) f.length()];
            DataInputStream dis = new DataInputStream(new FileInputStream(f));
            dis.readFully(ba);   // a single read() call may return early
            dis.close();
            // Baseline is taken after the file is loaded, so the result
            // counts only what the parser allocates.
            long startMem = rt.totalMemory() - rt.freeMemory();
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            factory.setExpandEntityReferences(false);
            DocumentBuilder parser = factory.newDocumentBuilder();
            ByteArrayInputStream bais = new ByteArrayInputStream(ba);
            Document doc = parser.parse(bais);   // hold a reference to the DOM tree
            long endMem = rt.totalMemory() - rt.freeMemory();
            System.out.println("Memory Use: " + ((float) endMem - startMem) / (1 << 20) + " MB.");
        } catch (Exception e) {
            System.out.println("exception ==> " + e);
        }
    }
}

Table 1. VTD-XML vs. Xerces DOM: Comparing memory usage.
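
Both listings compute used heap inline as totalMemory() minus freeMemory(). As a minimal sketch, that measurement could be factored into a small helper. The class and method names here (HeapProbe, usedHeap, settledUsedHeap, toMegabytes) are made up for illustration and are not part of the benchmark code, and the System.gc() call is an addition the original listings do not make; it is only a hint to the JVM, so readings remain approximate.

```java
final class HeapProbe {
    /** Used heap in bytes: the same quantity the listings compute inline. */
    static long usedHeap() {
        Runtime rt = Runtime.getRuntime();
        return rt.totalMemory() - rt.freeMemory();
    }

    /** Used heap after requesting a collection; System.gc() is only a hint. */
    static long settledUsedHeap() {
        System.gc();
        return usedHeap();
    }

    /** Converts a byte count to megabytes (1 MB = 2^20 bytes, as in the listings). */
    static double toMegabytes(long bytes) {
        return bytes / (double) (1 << 20);
    }

    public static void main(String[] args) {
        long before = settledUsedHeap();
        byte[] block = new byte[16 << 20];   // stand-in for a parsed document
        long after = settledUsedHeap();
        System.out.println("Memory Use: " + toMegabytes(after - before) + " MB.");
        // Use the array after the second reading so it is still reachable there.
        System.out.println("Held bytes: " + block.length);
    }
}
```

Taking both readings through the same helper keeps the two benchmarks comparable, although the exact figure still depends on the JVM and its garbage collector.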

Table 2 (shown below) lists the measured memory consumption. The test code was run on HotSpot's server JVM (standard with JDK 1.5) with a maximum heap size of 800 megabytes. With DOM, parsing "po_huge.xml" exhausts the JVM's heap and results in an OutOfMemoryError; VTD-XML, on the other hand, passed the test without any problems.
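
The storage model described at the start of this section (large memory blocks holding tokens rather than many small objects) can be illustrated with a rough sketch. The class below packs each token into one 64-bit slot of a long[] and reads the text straight out of the original byte buffer. The name PackedTokens and the field widths (4-bit type, 8-bit depth, 20-bit length, 32-bit offset) are assumptions chosen for this example; they are not taken from VTD-XML's actual record layout or API.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

final class PackedTokens {
    // Hypothetical layout for this sketch: [type:4][depth:8][length:20][offset:32]
    private long[] records = new long[1024];
    private int count = 0;

    /** Appends one token; the backing array grows in large steps, not per token. */
    void add(int type, int depth, int length, long offset) {
        if (count == records.length) {
            records = Arrays.copyOf(records, count * 2);
        }
        records[count++] = ((long) (type & 0xF) << 60)
                | ((long) (depth & 0xFF) << 52)
                | ((long) (length & 0xFFFFF) << 32)
                | (offset & 0xFFFFFFFFL);
    }

    int size()        { return count; }
    int type(int i)   { return (int) (records[i] >>> 60); }
    int depth(int i)  { return (int) ((records[i] >>> 52) & 0xFF); }
    int length(int i) { return (int) ((records[i] >>> 32) & 0xFFFFF); }
    long offset(int i) { return records[i] & 0xFFFFFFFFL; }

    /** Materializes a token's text on demand from the untouched document bytes. */
    String text(byte[] doc, int i) {
        return new String(doc, (int) offset(i), length(i), StandardCharsets.US_ASCII);
    }

    public static void main(String[] args) {
        byte[] doc = "<a>hi</a>".getBytes(StandardCharsets.US_ASCII);
        PackedTokens tokens = new PackedTokens();
        tokens.add(0, 0, 1, 1);   // element name "a" at offset 1 (type 0 is arbitrary here)
        tokens.add(1, 1, 2, 3);   // text "hi" at offset 3 (type 1 is arbitrary here)
        for (int i = 0; i < tokens.size(); i++) {
            System.out.println(tokens.depth(i) + ": " + tokens.text(doc, i));
        }
    }
}
```

In a scheme like this, each token costs eight bytes on top of the document itself, with no per-node object headers or string copies, which is one plausible reading of why the VTD-XML figures below stay close to the file size.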

File Name (Size)          Description                  VTD-XML Memory   VTD-XML Factor   DOM Memory      DOM Factor
------------------------  ---------------------------  ---------------  ---------------  --------------  -------------
blog.xml (1.3 MB)         RSS feed from InfoWorld      1.64 MB          1.29x            5.35 MB         4.2x
bioinfo.xml (4.4 MB)      bio-informatics data file    6.05 MB          1.42x            27.08 MB        6.3x
address.xml (15.24 MB)    address book data            26.39 MB         1.73x            109.83 MB       7.2x
cd.xml (25.57 MB)         CD catalog file              48.67 MB         1.90x            211.48 MB       8.3x
po.xml (70.3 MB)          purchase order               122.57 MB        1.68x            514.03 MB       7.05x
po_huge.xml (405.60 MB)   super-sized purchase order   686.0 MB         1.69x            Out of memory   Out of memory

Table 2. Memory Usage Comparison: Between VTD-XML and DOM.
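
The multiplying factor in Table 2 is the measured memory usage divided by the file size. A short worked example using the cd.xml row shows the arithmetic; the class and method names are hypothetical and exist only for this sketch.

```java
import java.util.Locale;

final class MultiplyingFactor {
    /** Memory used per megabyte of input document. */
    static double factor(double memoryUsedMb, double fileSizeMb) {
        return memoryUsedMb / fileSizeMb;
    }

    /** Formats a factor with two decimals, independent of the default locale. */
    static String format(double factor) {
        return String.format(Locale.ROOT, "%.2fx", factor);
    }

    public static void main(String[] args) {
        double fileSizeMb = 25.57;                   // cd.xml
        double vtd = factor(48.67, fileSizeMb);      // VTD-XML memory usage
        double dom = factor(211.48, fileSizeMb);     // DOM memory usage
        System.out.println("VTD-XML: " + format(vtd) + ", DOM: " + format(dom));
        // prints "VTD-XML: 1.90x, DOM: 8.27x"
    }
}
```

The same arithmetic suggests why the largest file fails under DOM: at a factor of roughly 7x, a 405.60 MB document would need on the order of 2.8 GB, far more than the 800 MB heap used in the test.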


