StAX: DOM Ease with SAX Efficiency : Page 2

StAX (the Streaming API for XML) is a memory-efficient, simple, and convenient way to process XML while retaining control over the parsing and writing process.



Patterns for Using StAX

If your XML is anything more than trivial, you'll find that putting all that parsing logic inside one large event loop can quickly become unmanageable and hard to maintain. A better way to do this is to group logically related units of parsing work into discrete components that can be called from within the main event loop.

Take the following ATOM XML feed file as an example:

<?xml version="1.0" encoding="utf-8"?>
<feed xmlns="http://www.w3.org/2005/Atom">
  <title>Simple Atom Feed File</title>
  <subtitle>Using StAX to read feed files</subtitle>
  <link href="http://example.org/"/>
  <updated>2006-01-01T18:30:02Z</updated>
  <author>
    <name>Feed Author</name>
    <email>doofus@feed.com</email>
  </author>
  <entry>
    <title>StAX parsing is simple</title>
    <link href="http://www.devx.com"/>
    <updated>2006-01-01T18:30:02Z</updated>
    <summary>Learn how to use StAX</summary>
  </entry>
</feed>



To make life easy, create a small piece of infrastructure. Start by defining a ComponentParser interface that defines the contract between the main StAX event loop and parsing components:

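import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

public interface ComponentParser {
    // Named parse to match the implementations and the main event loop.
    void parse(XMLStreamReader staxXmlReader) throws XMLStreamException;
}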

This allows parsing components to be dealt with in a common way through the interface.

Define two concrete parsers: one to parse ATOM author elements and one to parse ATOM entry elements. Ensure that they implement the ComponentParser interface.

The following is the AuthorParser class:

import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

public class AuthorParser implements ComponentParser {

    // Called with the reader positioned on the <author> start tag.
    public void parse(XMLStreamReader staxXmlReader) throws XMLStreamException {
        // read name
        StaxUtil.moveReaderToElement("name", staxXmlReader);
        String name = staxXmlReader.getElementText();

        // read email
        StaxUtil.moveReaderToElement("email", staxXmlReader);
        String email = staxXmlReader.getElementText();

        // Do something with author data...
    }
}

The following is the EntryParser class:

import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

public class EntryParser implements ComponentParser {

    // Called with the reader positioned on the <entry> start tag.
    public void parse(XMLStreamReader staxXmlReader) throws XMLStreamException {
        // read title
        StaxUtil.moveReaderToElement("title", staxXmlReader);
        String title = staxXmlReader.getElementText();

        // read the link's href attribute by name rather than by position
        StaxUtil.moveReaderToElement("link", staxXmlReader);
        String linkHref = staxXmlReader.getAttributeValue(null, "href");

        // read updated
        StaxUtil.moveReaderToElement("updated", staxXmlReader);
        String updated = staxXmlReader.getElementText();

        // read summary
        StaxUtil.moveReaderToElement("summary", staxXmlReader);
        String summary = staxXmlReader.getElementText();

        // Do something with the data read from StAX...
    }
}

The StaxUtil class is just a helper class for reading from the StAX reader until the code finds the target element. Note that you should take care to (1) read elements in the correct order, (2) not read past the end of the stream, and (3) not read data that belongs to other ComponentParsers.
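The article does not list StaxUtil, so the following is only a minimal sketch of how moveReaderToElement might be written. It assumes the method advances the reader to the next start tag with the given local name; the method body, and the choice to throw when the document ends first, are assumptions rather than the author's code.

```java
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

// Hypothetical helper: only the call StaxUtil.moveReaderToElement(name, reader)
// appears in the article; everything inside the method is an assumption.
final class StaxUtil {

    private StaxUtil() {
        // static helpers only
    }

    /**
     * Advances the reader until it sits on a START_ELEMENT whose local name
     * equals target. The search begins with the event after the current one.
     */
    static void moveReaderToElement(String target, XMLStreamReader reader)
            throws XMLStreamException {
        // hasNext() turns false after END_DOCUMENT, so we never read past the stream.
        while (reader.hasNext()) {
            int event = reader.next();
            if (event == XMLStreamConstants.START_ELEMENT
                    && target.equals(reader.getLocalName())) {
                return;
            }
        }
        throw new XMLStreamException(
                "Element <" + target + "> not found before end of document");
    }
}
```

This sketch covers point (2) above, but not point (3): it will happily skip into a sibling component's data if the target element is missing, so a stricter version would also stop at the end tag of the enclosing element.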

In the main event loop, modify the code to farm out parsing work to ComponentParsers based on the XML element name. ComponentParsers can be pre-registered with the main class prior to parsing. The advantage of this pattern is that it keeps the main event loop simple and free of any knowledge of the ATOM XML format. ComponentParsers still pull data from StAX, but they are neatly separated and can be reused (e.g., for elements that recur in the XML hierarchy). You can now apply the loop to any XML file, provided you have registered the appropriate ComponentParsers. The following is the main event loop using a component parser registry:

public class StaxParser implements ComponentParser {

    private Map<String, ComponentParser> delegates;
    …

    public void parse(XMLStreamReader staxXmlReader) throws XMLStreamException {
        for (int event = staxXmlReader.next();
             event != XMLStreamConstants.END_DOCUMENT;
             event = staxXmlReader.next()) {
            if (event == XMLStreamConstants.START_ELEMENT) {
                String element = staxXmlReader.getLocalName();
                // If a ComponentParser is registered that can handle
                // this element, delegate to it.
                if (delegates.containsKey(element)) {
                    ComponentParser parser = delegates.get(element);
                    parser.parse(staxXmlReader);
                }
            }
        } // end for
    }
}
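The listing above elides how the registry is populated. As a rough sketch only, the registration side could look like the following. The registerParser signature is inferred from the way the test case calls it, and the HashMap-backed registry and the hasNext() loop are assumptions, not the author's code. The ComponentParser interface is repeated so the block compiles on its own.

```java
import java.util.HashMap;
import java.util.Map;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

// Same contract as the article's ComponentParser, repeated for self-containment.
interface ComponentParser {
    void parse(XMLStreamReader staxXmlReader) throws XMLStreamException;
}

class StaxParser implements ComponentParser {

    // Assumed registry: element local name -> parser that handles it.
    private final Map<String, ComponentParser> delegates = new HashMap<>();

    // Hypothetical method; only its call sites appear in the article.
    public void registerParser(String elementName, ComponentParser parser) {
        delegates.put(elementName, parser);
    }

    @Override
    public void parse(XMLStreamReader staxXmlReader) throws XMLStreamException {
        // Looping on hasNext() ends cleanly even if a delegate has already
        // consumed the stream up to END_DOCUMENT.
        while (staxXmlReader.hasNext()) {
            if (staxXmlReader.next() == XMLStreamConstants.START_ELEMENT) {
                ComponentParser parser = delegates.get(staxXmlReader.getLocalName());
                if (parser != null) {
                    parser.parse(staxXmlReader);
                }
            }
        }
    }
}
```

Keying the map on the local name keeps lookup simple, but it ignores namespaces. A feed that mixes vocabularies would probably need the registry keyed on the element's QName instead.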

Here's how you would put it all together in a test case:

InputStream in = this.getClass().getResourceAsStream("atom.xml");
XMLInputFactory factory = XMLInputFactory.newInstance();
XMLStreamReader staxXmlReader = factory.createXMLStreamReader(in);

StaxParser parser = new StaxParser();
parser.registerParser("author", new AuthorParser());
parser.registerParser("entry", new EntryParser());
parser.parse(staxXmlReader);


