Figure 2. Parsing Throughput Comparison: The results for VTD-XML, DOM, and SAX.
Parsing Performance
To compare the parsing performance of VTD-XML with Xerces DOM, I ran the benchmark code (shown below) on the same set of test files used for the memory test above, using the HotSpot server JVM, which does an excellent job of JIT (just-in-time) compiling byte code to native code. To measure the peak performance of each parser, the sample code first goes through a warm-up stage so that the parsing routines are executed as compiled native code. The test files were first read into byte arrays so that the results exclude any timing variation due to disk I/O.
VTD-XML Sample Code Measuring Parsing Performance

    // Warm-up: parse repeatedly for 30 seconds so the JIT compiles the parsing routines.
    l = System.currentTimeMillis();
    while (System.currentTimeMillis() - l < 30000) {
        vg.setDoc(ba);
        vg.parse(true);
    }
    // Measurement: 20 runs of 'total' parses each.
    for (int j = 0; j < 20; j++) {
        l = System.currentTimeMillis();
        for (int i = 0; i < total; i++) {
            vg.setDoc(ba);
            vg.parse(true);
        }
        long l2 = System.currentTimeMillis();
        System.out.println("l2 - l " + (l2 - l) + " ms");
        System.out.println(" average parsing time ==> " + ((double) (l2 - l) / total));
        System.out.println(" performance ==> " + (((double) fl * 1000 * total) / ((l2 - l) * (1 << 20))));
    }

DOM Sample Code Measuring Parsing Performance

    // Warm-up: parse repeatedly for 30 seconds so the JIT compiles the parsing routines.
    l = System.currentTimeMillis();
    while (System.currentTimeMillis() - l < 30000) {
        ByteArrayInputStream bais = new ByteArrayInputStream(ba);
        parser.parse(bais);
    }
    // Measurement: 10 runs of 'total' parses each.
    for (int j = 0; j < 10; j++) {
        l = System.currentTimeMillis();
        for (int i = 0; i < total; i++) {
            ByteArrayInputStream bais = new ByteArrayInputStream(ba);
            parser.parse(bais);
        }
        long l2 = System.currentTimeMillis();
        System.out.println(" average parsing time ==> " + ((float) (l2 - l) / total));
        System.out.println(" performance ==> " + (((double) fl * 1000 * total) / ((l2 - l) * (1 << 20))));
    }

Table 3. VTD-XML vs. Xerces DOM: Comparing parsing performance.
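The "performance" expression printed by both listings converts a byte count and an elapsed time into megabytes per second. The sketch below isolates that arithmetic. It assumes that `fl` is the file length in bytes and that `total` is the number of parses per run, which the listings imply but do not state; the class and method names are hypothetical.

```java
class ThroughputFormula {

    // Mirrors the expression in the listings:
    //   ((double) fl * 1000 * total) / ((l2 - l) * (1 << 20))
    // fileLengthBytes * total = bytes parsed in one run,
    // * 1000 / elapsedMillis  = bytes per second,
    // / (1 << 20)             = megabytes (2^20 bytes) per second.
    static double throughputMBps(long fileLengthBytes, int total, long elapsedMillis) {
        return ((double) fileLengthBytes * 1000 * total)
                / (elapsedMillis * (1L << 20));
    }

    public static void main(String[] args) {
        // Illustrative input only: a 1 MB file parsed 10 times in 1 second is 10 MB/sec.
        System.out.println(throughputMBps(1L << 20, 10, 1000) + " MB/sec");
    }
}
```

Note that a run shorter than the millisecond timer resolution yields an elapsed time of zero, so `total` needs to be large enough for each run to take a measurable amount of time.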
The parsing performance results are listed in Table 4. As another reference point, I also measured the performance of Xerces SAX with a null content handler:
| File Name (Size in MB) | VTD-XML (time) | VTD-XML (MB/sec) | DOM (time) | DOM (MB/sec) | SAX (time) | SAX (MB/sec) |
|---|---|---|---|---|---|---|
| blog.xml (1.3 MB) | 17.98 ms | 70.8 | 89.1 ms | 14.3 | 22.65 ms | 56.18 |
| bioinfo.xml (4.4 MB) | 68.8 ms | 62.04 | 406.2 ms | 10.5 | 103.0 ms | 41.44 |
| address.xml (15.24 MB) | 293.8 ms | 51.87 | 1703.2 ms | 8.95 | 556.2 ms | 27.40 |
| cd.xml (25.57 MB) | 493.8 ms | 51.8 | 2825.0 ms | 9.13 | 759.4 ms | 33.68 |
| po.xml (70.3 MB) | 1337 ms | 54.50 | 7015 ms | 10.35 | 2743.8 ms | 34.46 |
| po_huge.xml (405.60 MB) | 6.87 s | 59.76 | Out of memory | Out of memory | 10.756 s | 37.687 |
Table 4. Parsing performance summary.
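The code behind the SAX column is not shown in the listings. A minimal sketch of how such a measurement could be run follows. It is not the benchmark used for Table 4: it assumes the JDK's built-in JAXP SAX parser rather than a specific Xerces release, uses `DefaultHandler` (whose callbacks do nothing) as the null content handler, substitutes a synthetic in-memory document for the real test files, and replaces the 30-second warm-up with a short fixed number of parses. The class and method names are hypothetical.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.helpers.DefaultHandler;

class SaxNullHandlerBenchmark {

    // Builds a synthetic document in memory; a stand-in for the real test files.
    static byte[] sampleDocument(int records) {
        StringBuilder sb = new StringBuilder("<?xml version=\"1.0\" encoding=\"UTF-8\"?><orders>");
        for (int i = 0; i < records; i++) {
            sb.append("<order id=\"").append(i).append("\"><item>widget</item></order>");
        }
        sb.append("</orders>");
        return sb.toString().getBytes(StandardCharsets.UTF_8);
    }

    // Parses the same byte array 'total' times and returns the elapsed nanoseconds.
    static long timeParses(SAXParser parser, byte[] ba, int total) throws Exception {
        DefaultHandler nullHandler = new DefaultHandler(); // every callback is a no-op
        long start = System.nanoTime();
        for (int i = 0; i < total; i++) {
            parser.parse(new ByteArrayInputStream(ba), nullHandler);
        }
        return System.nanoTime() - start;
    }

    // Same conversion as in the listings, but from nanoseconds instead of milliseconds.
    static double throughputMBps(long bytes, int total, long nanos) {
        return ((double) bytes * total * 1e9) / ((double) nanos * (1 << 20));
    }

    public static void main(String[] args) throws Exception {
        byte[] ba = sampleDocument(5_000);
        SAXParser parser = SAXParserFactory.newInstance().newSAXParser();

        timeParses(parser, ba, 200); // shortened warm-up (assumption; the article warms up for 30 s)

        int total = 100;
        long nanos = timeParses(parser, ba, total);
        System.out.printf("average parsing time ==> %.3f ms%n", nanos / 1e6 / total);
        System.out.printf("performance ==> %.2f MB/sec%n", throughputMBps(ba.length, total, nanos));
    }
}
```

Because the handler discards every event, the measured time reflects tokenizing and event dispatch only, which is what makes this configuration a useful lower bound for SAX-based processing.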
As Table 4 demonstrates, VTD-XML is not only much faster than DOM (roughly five to six times the throughput on every file DOM could parse), it also outperforms SAX with a null content handler by a factor of about 1.3 to 1.9. This is, again, the result of VTD-XML's superior memory allocation strategy.