Build Your Own Lightweight XML DOM Parser

Editor’s Note: This tutorial is also available in a version for Java developers.

XML is rapidly gaining popularity with application developers as a data storage and exchange format because of its readability, ease of use, and firewall friendliness. Microsoft’s full-featured MSXML parser is large and powerful, and it consumes resources commensurate with that power at installation and run time. The MSXML parser is well over 1MB.

Full-featured XML parsers offer rich functionality: they support XML namespaces, DTD and schema validation, multiple character-set encodings, and more. However, not every real-world project requires all that functionality. An application may need only a simple data transfer protocol within a closed environment, or may work with XML data that’s always valid and uses only simple ASCII characters. Under those circumstances, employing a full-blown XML parser is probably overkill. Furthermore, if you’re deploying the application over the Web or to a handheld device, network bandwidth or system memory constraints may make a full-featured XML parser impractical.

Enter the SimpleDOMParser
The SimpleDOMParser is a highly simplified and ultra-lightweight XML DOM parser written in pure VB. The source code is fewer than 400 lines long.

Obviously, with such a small code base, the SimpleDOMParser won’t support XML namespaces, understand multiple character-set encodings, or validate documents against a DTD or schema. What the SimpleDOMParser can do well is parse a stream of well-formed XML tags into a DOM-like element tree, letting you perform the common task of extracting data from XML-formatted text.
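To make the idea of turning a stream of tags into an element tree concrete, here is a minimal sketch. It is written in Python rather than VB, and it is not the SimpleDOMParser source: the `Element` class, the `parse` function, and the regular expressions are all hypothetical names chosen for illustration. It assumes well-formed ASCII input, ignores attributes, and does not handle CDATA sections, DOCTYPE declarations, or entity references.

```python
import re
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Element:
    """One node of the DOM-like tree: a tag name, its text, and its children."""
    name: str
    text: str = ""
    children: List["Element"] = field(default_factory=list)


# XML declarations, processing instructions, and comments are dropped up front.
SKIP = re.compile(r"<\?.*?\?>|<!--.*?-->", re.S)

# Either a tag (optional leading "/", a name, ignored attributes, optional
# trailing "/") or a run of text between tags.
TOKEN = re.compile(r"<(/?)([A-Za-z_][\w.-]*)[^<>]*?(/?)>|([^<]+)")


def parse(xml: str) -> Element:
    """Parse well-formed XML into an Element tree (assumption: no validation)."""
    root: Optional[Element] = None
    stack: List[Element] = []  # elements that are currently open
    for closing, name, self_closing, text in TOKEN.findall(SKIP.sub("", xml)):
        if text:
            # Attach non-blank text to the innermost open element.
            if stack and text.strip():
                stack[-1].text += text.strip()
            continue
        if closing:
            # A closing tag must match the innermost open element.
            if not stack or stack[-1].name != name:
                raise ValueError(f"unexpected </{name}>")
            stack.pop()
            continue
        node = Element(name)
        if stack:
            stack[-1].children.append(node)
        elif root is None:
            root = node
        else:
            raise ValueError("more than one root element")
        if not self_closing:
            stack.append(node)
    if root is None or stack:
        raise ValueError("document is empty or has unclosed tags")
    return root


doc = '<?xml version="1.0"?><order id="7"><item>Widget</item><item>Gadget</item><note/></order>'
root = parse(doc)
print(root.name, [child.name for child in root.children])
```

The only state the sketch keeps while scanning is a stack of open elements, which is enough to recover tag names, hierarchy, and content. A real parser would also need to decode entities such as `&amp;` and expose attributes.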

Why use DOM as a model rather than SAX? The DOM provides an easier-to-use programming interface than SAX. Unlike SAX, when you process an XML document as a DOM tree, all the information within the document is always available. Although the SAX parsing model performs better and uses far less memory than the DOM model, most developers have, at times, found themselves building a complete or partial DOM tree anyway when using SAX. Using SAX, an application processes only one tag at a time. If the contents of other tags are needed during processing, you must maintain some sort of global state throughout the process. Maintaining that global state is essentially the purpose of the DOM model. Still, many small XML applications don’t need the full DOM model. Therefore, the SimpleDOMParser provides access to tag names, hierarchy, and content but doesn’t bother with many features found in the full W3C DOM specification.
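The trade-off can be illustrated with a small comparison. The sketch below uses Python’s standard `xml.sax` and `xml.dom.minidom` modules as stand-ins (not VB, and not the SimpleDOMParser); the `PriceHandler` class and the sample document are made up for the example. Both halves build the same name-to-price table, but the SAX handler has to track the open-tag path and the fields of the current item by hand.

```python
import xml.sax
from xml.dom import minidom

DOC = (b"<items>"
       b"<item><name>Widget</name><price>9.50</price></item>"
       b"<item><name>Gadget</name><price>12.00</price></item>"
       b"</items>")


class PriceHandler(xml.sax.ContentHandler):
    """SAX delivers one event at a time, so context must be remembered here."""

    def __init__(self):
        super().__init__()
        self.path = []     # stack of open tag names (hand-rolled state)
        self.current = {}  # fields collected for the <item> being read
        self.prices = {}   # result: item name -> price

    def startElement(self, name, attrs):
        self.path.append(name)
        if name == "item":
            self.current = {}

    def characters(self, content):
        # Only keep text that sits inside a direct child of <item>.
        if len(self.path) >= 2 and self.path[-2] == "item":
            key = self.path[-1]
            self.current[key] = self.current.get(key, "") + content

    def endElement(self, name):
        self.path.pop()
        if name == "item":
            item_name = self.current["name"].strip()
            self.prices[item_name] = float(self.current["price"])


# SAX: the handler accumulates the answer as events stream past.
handler = PriceHandler()
xml.sax.parseString(DOC, handler)

# DOM: the whole tree is in memory, so related tags can be looked up directly.
dom_prices = {}
for item in minidom.parseString(DOC).getElementsByTagName("item"):
    name = item.getElementsByTagName("name")[0].firstChild.data
    price = item.getElementsByTagName("price")[0].firstChild.data
    dom_prices[name] = float(price)

print(handler.prices == dom_prices)
```

The `path` and `current` fields in the handler are a partial tree in disguise, which is the kind of bookkeeping a DOM-style parser does once on the application’s behalf.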
