Creating an XML Document: The Way of Speed : Page 3

There are many ways to create an XML document from inside a VB application. For example, you can use string concatenation, the XML DOM parser, or the SAX approach. Each has its strengths and weaknesses, but have you ever asked yourself which is the fastest? The author of this article did, and now shares his somewhat surprising results.


The results

If you have read this far, you are probably interested in what I discovered about performance, so let's look at a table (sorry, no graphics):



 

| Nodes | Fathers x Children x Size | File Size (bytes) | DOM (ms) | SAX (ms) | String Class (ms) |
|---|---|---|---|---|---|
| 10000 | 100 x 100 x 10 | 620215 | 490 | 290 | 287 |
| 20000 | 200 x 100 x 10 | 1240415 | 950 | 565 | 531 |
| 50000 | 500 x 100 x 10 | 3101015 | 2430 | 1390 | 1301 |
| Nodes (100 chars, 100 children) | | | | | |
| 10000 | 100 x 100 x 100 | 2420215 | 530 | 490 | 345 |
| 20000 | 200 x 100 x 100 | 4840415 | 1045 | 960 | 662 |
| 50000 | 500 x 100 x 100 | 12101015 | 2665 | 2261 | 1712 |
| Nodes (10 chars, 10 children only) | | | | | |
| 10000 | 1000 x 10 x 100 | 618015 | 515 | 315 | 381 |
| 20000 | 2000 x 10 x 100 | 1236015 | 1045 | 630 | 752 |
| 50000 | 5000 x 10 x 100 | 3090015 | 2645 | 1543 | 1877 |
| Nodes (10 chars, 1000 children) | | | | | |
| 10000 | 10 x 1000 x 10 | 656075 | 515 | 290 | 271 |
| 20000 | 20 x 1000 x 10 | 1312135 | 995 | 565 | 532 |
| 50000 | 50 x 1000 x 10 | 3280315 | 2475 | 1372 | 1305 |


The table above should be read as follows:

The first two columns define how the XML file is formed: the total number of nodes, then how many LEVEL1 (father) nodes there are, how many LEVEL2 (children) nodes each one contains, and how long they are (in characters). The third column is the size (in bytes) of the resulting XML file. The next three columns are the measured times in milliseconds, taken with the Stopwatch class that Karl Petersen also includes in the String Builder sample project. Each figure is the average of 5 runs on a PIII 650 notebook with 256 MB of RAM.
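The measurement procedure described above can be sketched in a few lines. This is a minimal Python illustration, not the article's VB code: `average_ms` and `node_count` are hypothetical helper names, and `time.perf_counter` stands in for the Stopwatch class.

```python
import time

def node_count(fathers, children):
    """Total LEVEL2 nodes: every LEVEL1 (father) node holds `children` nodes."""
    return fathers * children

def average_ms(build, runs=5):
    """Mean wall-clock time of build() in milliseconds over `runs` calls."""
    total = 0.0
    for _ in range(runs):
        start = time.perf_counter()
        build()
        total += time.perf_counter() - start
    return total / runs * 1000.0

if __name__ == "__main__":
    # A trivial stand-in workload; a real test would build the XML document here.
    elapsed = average_ms(lambda: "x" * 10 * node_count(100, 100))
    print(f"{node_count(100, 100)} nodes: {elapsed:.3f} ms on average")
```

Averaging several runs, as the article does, smooths out one-off delays such as disk caching or other processes competing for the CPU.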

What can we learn from the table?

First, suppose you just need the fastest way to create an XML document, with no need to process it afterwards (for example, you only write it to disk or pass it to a Web browser), and at the same time you want to let someone else (the Microsoft programmers) worry about everything that makes an XML document well formed. In that case, it's best to work through Microsoft's really sparse documentation and learn MXXMLWriter.
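To give a feel for the writer-style approach, here is a minimal sketch in Python rather than VB. It uses the standard library's `xml.sax.saxutils.XMLGenerator` as a stand-in for MXXMLWriter; the two are different libraries, so treat this only as an analogy. The `ROOT` element name and the function name are assumptions made for the example.

```python
import io
from xml.sax.saxutils import XMLGenerator

def build_with_sax_writer(fathers, children, size):
    """Emit the test document through SAX-style events; nothing is kept in a tree."""
    out = io.StringIO()
    writer = XMLGenerator(out, encoding="utf-8")
    text = "x" * size
    writer.startDocument()
    writer.startElement("ROOT", {})          # "ROOT" is an assumed name
    for _ in range(fathers):
        writer.startElement("LEVEL1", {})
        for _ in range(children):
            writer.startElement("LEVEL2", {})
            writer.characters(text)          # the writer escapes special characters
            writer.endElement("LEVEL2")
        writer.endElement("LEVEL1")
    writer.endElement("ROOT")
    writer.endDocument()
    return out.getvalue()

if __name__ == "__main__":
    print(len(build_with_sax_writer(100, 100, 10)), "characters written")
```

The point of the design is that the caller only announces events (start element, characters, end element) and the writer takes care of producing well-formed output.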

Second, the DOM is not that slow. Before starting this experiment I believed that using the DOM to create really large documents would perform poorly, because of the overhead the DOM object model needs to keep in memory all the information that lets you navigate it. That is not the case. Considering the elegance of the DOM model and all the technologies that rely on it (XSLT, validation, XPath and so on), if you do not need the ultimate performance, the tradeoff between performance and usability tilts heavily towards DOM. Besides, it seems (from Microsoft's figures for the current beta code) that the upcoming MSXML 4.0 parser will greatly improve performance.
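The DOM approach can be sketched in the same way. This example uses Python's `xml.dom.minidom` as an illustrative substitute for the MSXML DOM, so the timings in the table do not apply to it; the `ROOT` element name and the function name are assumptions.

```python
from xml.dom.minidom import getDOMImplementation

def build_with_dom(fathers, children, size):
    """Build the test document as an in-memory tree and return the Document."""
    doc = getDOMImplementation().createDocument(None, "ROOT", None)
    root = doc.documentElement
    text = "x" * size
    for _ in range(fathers):
        level1 = doc.createElement("LEVEL1")
        root.appendChild(level1)
        for _ in range(children):
            level2 = doc.createElement("LEVEL2")
            level2.appendChild(doc.createTextNode(text))
            level1.appendChild(level2)
    return doc

if __name__ == "__main__":
    doc = build_with_dom(2, 3, 4)
    # The tree stays in memory, so it can be navigated without re-parsing.
    print(len(doc.getElementsByTagName("LEVEL2")))
    print(doc.toxml())
```

The extra memory spent on the tree is what buys the ability to navigate or transform the document right after building it.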

Last, but not least, if performance is your ultimate goal and the document you're working with is very large, consider simple string concatenation. It suits the task only for particular documents (for example, documents for which you're sure no character escaping is needed, such as a document containing only numbers) and only if you work in a language where string operations are really fast. But if you need the ultimate performance, just look at the table.
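The escaping caveat is easy to demonstrate. The following Python sketch (again an illustration, not the article's VB code; all function names are hypothetical) shows that plain concatenation is safe for numeric text but breaks as soon as the text contains a character such as `&`.

```python
from xml.dom.minidom import parseString
from xml.parsers.expat import ExpatError
from xml.sax.saxutils import escape

def level2_unescaped(text):
    """Plain concatenation: fast, but trusts the text completely."""
    return "<LEVEL2>" + text + "</LEVEL2>"

def level2_escaped(text):
    """Concatenation plus escaping of &, < and > in the text."""
    return "<LEVEL2>" + escape(text) + "</LEVEL2>"

def is_well_formed(xml_text):
    """True if an XML parser accepts the string."""
    try:
        parseString(xml_text)
        return True
    except ExpatError:
        return False

if __name__ == "__main__":
    print(is_well_formed(level2_unescaped("12345")))   # numbers only: safe
    print(is_well_formed(level2_unescaped("a & b")))   # raw ampersand: rejected
    print(is_well_formed(level2_escaped("a & b")))     # escaped: accepted
```

Adding the escaping step costs time on every node, which is part of why raw concatenation is only a fair comparison for documents that never need it.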

Conclusions

From the tests I've made, and considering the long list of capabilities of DOM versus the poor documentation of SAX, I believe that using DOM whenever you need to write a document is the way to go. Only if you need the ultimate performance is it best to use SAX or to investigate fast string manipulation.
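The size of the tradeoff can be read straight off the table. The short Python sketch below copies four rows from it (one per group) and computes how much longer DOM takes than the other two approaches; the dictionary and function names are made up for the example.

```python
# shape -> (DOM ms, SAX ms, String Class ms), copied from the results table
timings = {
    "100 x 100 x 10": (490, 290, 287),
    "500 x 100 x 100": (2665, 2261, 1712),
    "5000 x 10 x 100": (2645, 1543, 1877),
    "50 x 1000 x 10": (2475, 1372, 1305),
}

def slowdown(dom_ms, other_ms):
    """How many times longer DOM takes than the other approach."""
    return dom_ms / other_ms

for shape, (dom, sax, string) in timings.items():
    print(f"{shape}: DOM/SAX = {slowdown(dom, sax):.2f}, "
          f"DOM/String = {slowdown(dom, string):.2f}")
```

For every row in the table, the DOM time stays below twice the time of the fastest alternative, which is the basis for saying that DOM is a reasonable default.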

It would be interesting to run more experiments (for example, making the depth of the nodes another variable of the test); the code is free, so you can do it. Keep in mind also that the XML document created here is really simple so, as they say, your mileage may vary.
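As a starting point for such an experiment, here is one way the nesting depth could become a parameter. It is a hypothetical Python sketch, not part of the article's test code: `build_nested`, `write_level`, the `ROOT` name and the `LEVEL<n>` naming beyond level 2 are all assumptions.

```python
import io
from xml.sax.saxutils import XMLGenerator

def write_level(writer, level, depth, fanout, text):
    """Write `fanout` elements named LEVEL<level>, recursing until `depth`."""
    name = f"LEVEL{level}"
    for _ in range(fanout):
        writer.startElement(name, {})
        if level < depth:
            write_level(writer, level + 1, depth, fanout, text)
        else:
            writer.characters(text)   # only the deepest level carries text
        writer.endElement(name)

def build_nested(depth, fanout, size):
    """Build a document `depth` levels deep with `fanout` children per node."""
    out = io.StringIO()
    writer = XMLGenerator(out, encoding="utf-8")
    writer.startDocument()
    writer.startElement("ROOT", {})
    write_level(writer, 1, depth, fanout, "x" * size)
    writer.endElement("ROOT")
    writer.endDocument()
    return out.getvalue()

if __name__ == "__main__":
    print(len(build_nested(depth=3, fanout=10, size=10)), "characters written")
```

With a fixed fanout, the number of leaf nodes grows as fanout to the power of depth, so depth and fanout would need to be balanced to keep the total node count comparable between runs.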

Besides that, you also need to consider what you will do with the XML document you've created. If, for example, you need to apply an XSL stylesheet to it, you'd better use DOM so that you do not spend time reloading the XML file into another DOM document. If you need to write the file to disk and you're using Visual C++, you can investigate string functions and the file mapping functions to write the file as fast as possible.

For any comments, you can email: massimogentilini@hotmail.com



Massimo Gentilini works in Italy for Gruppo Formula SpA, where he leads a group of highly skilled developers who create and maintain a fully Microsoft DNA compliant Supply Chain Management application. When he's not working, he usually spends time on a golf course and manages the golf-related web site you can browse at http://www.phillo.it/