First published by IBM at http://www-106.ibm.com/developerworks/xml/library/x-tipstx4/index.html
Until recently, programmers had only two choices when creating XML documents programmatically. Their first option was to directly write serialized XML content to the output stream, and the second was to use DOM.
Both options have severe drawbacks. In the first case, the programmer is fully responsible for ensuring that the resulting document is well formed. The programmer must take care of details such as matching start and end tags and escaping special characters, such as the less-than sign (<) and the ampersand (&), in character content. This makes such programs tedious to implement and error-prone. DOM, on the other hand, frees the programmer from this burden but introduces considerable overhead: The complete document must first be constructed as a node tree in memory before it can be serialized to an output stream.
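To make the first drawback concrete, here is a minimal sketch of what writing serialized XML by hand involves. The class ManualXml and its methods are hypothetical names chosen for illustration; they are not part of any API.

```java
final class ManualXml {

    // Hypothetical helper: escapes the characters that must not appear
    // literally in character content. The ampersand has to be replaced
    // first, otherwise the "&" of "&lt;" would be escaped a second time.
    static String escapeText(String text) {
        return text.replace("&", "&amp;")
                   .replace("<", "&lt;")
                   .replace(">", "&gt;");
    }

    // The programmer is also responsible for matching the tags.
    static String description(String text) {
        return "<description>" + escapeText(text) + "</description>";
    }

    public static void main(String[] args) {
        System.out.println(description("Blossoms in April & May"));
    }
}
```

Forgetting the call to escapeText(), or mistyping one of the two tag names, yields a document that is not well formed, and nothing reports the mistake at write time.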
Enter StAX
The Streaming API for XML (StAX) completely changes this. Unlike the Simple API for XML (SAX), StAX offers an API for writing XML documents. To be precise, it offers two APIs: a low-level, cursor-based API (XMLStreamWriter), and a higher-level, event-based API (XMLEventWriter). While the cursor-based API is best used in data binding scenarios (for example, creating a document from application data), the event-based API is typically used in pipelining scenarios where a new document is constructed from the data of input documents.
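For orientation only, the following is a minimal sketch of what the event-based API looks like when it is used to write a tiny document. The class name EventWriterSketch and the element name greeting are assumptions made for this illustration; the exact form of the XML declaration in the output depends on the StAX implementation.

```java
import java.io.StringWriter;
import javax.xml.stream.XMLEventFactory;
import javax.xml.stream.XMLEventWriter;
import javax.xml.stream.XMLOutputFactory;
import javax.xml.stream.XMLStreamException;

final class EventWriterSketch {

    // Builds a one-element document by adding event objects to the writer.
    static String write() {
        StringWriter out = new StringWriter();
        try {
            XMLEventFactory events = XMLEventFactory.newInstance();
            XMLEventWriter writer =
                XMLOutputFactory.newInstance().createXMLEventWriter(out);
            writer.add(events.createStartDocument());
            // Arguments are prefix, namespace URI, local name
            writer.add(events.createStartElement("", "", "greeting"));
            writer.add(events.createCharacters("Tulips & roses"));
            writer.add(events.createEndElement("", "", "greeting"));
            writer.add(events.createEndDocument());
            writer.close();
        } catch (XMLStreamException e) {
            throw new IllegalStateException(e);
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(write());
    }
}
```

Because every piece of the document is an event object, events read from an XMLEventReader can be passed to add() unchanged, which is what makes this API convenient for pipelining.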
The following example is implemented using the cursor-based API. (I will discuss the event-based API in the next tip.) The cursor-based API offers a variety of specific methods for creating the various items of the XML information set, such as elements, attributes, processing instructions, document type declarations, and character content. These methods take care of many formatting issues. For example, the method writeCharacters() automatically escapes characters like the less-than sign (<), the greater-than sign (>), and the ampersand (&). And the method writeEndDocument() automatically closes all open structures, so it does not matter whether the last call to writeEndElement() in the example is commented out.
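Separately from Listing 1, the two conveniences just described can be tried in isolation. The sketch below is only an illustration: the class name EscapingSketch and the element names note and text are made up, and the output is collected in a StringWriter so it can be inspected.

```java
import java.io.StringWriter;
import javax.xml.stream.XMLOutputFactory;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamWriter;

final class EscapingSketch {

    // Writes two nested elements with special characters in the text and
    // never calls writeEndElement(); writeEndDocument() closes both.
    static String render() {
        StringWriter out = new StringWriter();
        try {
            XMLStreamWriter xmlw =
                XMLOutputFactory.newInstance().createXMLStreamWriter(out);
            xmlw.writeStartElement("note");
            xmlw.writeStartElement("text");
            // The writer escapes the markup characters in this string
            xmlw.writeCharacters("1 < 2 & 3 > 2");
            // Closes <text> and <note> although no end element was written
            xmlw.writeEndDocument();
            xmlw.close();
        } catch (XMLStreamException e) {
            throw new IllegalStateException(e);
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(render());
    }
}
```

The returned string contains &lt; and &amp; in place of the raw characters and ends with the two closing tags.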
StAX can even generate namespace prefixes for namespaces that have not been formally declared. However, this is only done when the property javax.xml.stream.isPrefixDefaulting has been set to true for the output factory. (In the final StAX API, JSR 173, this property is named javax.xml.stream.isRepairingNamespaces and is available as the constant XMLOutputFactory.IS_REPAIRING_NAMESPACES.) If this property has been set to false, you must explicitly declare each namespace prefix and each namespace using the methods setPrefix() and writeNamespace(). In Listing 1, I have commented out these method calls because I have set prefix defaulting to true.
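For comparison, the opposite case can be sketched as follows: automatic prefix generation is switched off, so the prefix is bound with setPrefix() and the declaration is written with writeNamespace(). The class name ExplicitNamespaces is an assumption for this illustration, and the property is addressed through the constant from the final JSR 173 API.

```java
import java.io.StringWriter;
import javax.xml.stream.XMLOutputFactory;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamWriter;

final class ExplicitNamespaces {

    private static final String GARDENING = "http://com.bdaum.gardening";
    private static final String XHTML = "http://www.w3.org/1999/xhtml";

    static String render() {
        StringWriter out = new StringWriter();
        try {
            XMLOutputFactory xmlof = XMLOutputFactory.newInstance();
            // No automatic prefixes: every declaration is written by hand
            xmlof.setProperty(XMLOutputFactory.IS_REPAIRING_NAMESPACES, Boolean.FALSE);
            XMLStreamWriter xmlw = xmlof.createXMLStreamWriter(out);
            xmlw.writeStartElement("product");
            xmlw.writeDefaultNamespace(GARDENING);
            // Bind the prefix first so the writer can resolve the namespace URI
            xmlw.setPrefix("xhtml", XHTML);
            xmlw.writeStartElement(XHTML, "description");
            // Emit the xmlns:xhtml declaration on the description element
            xmlw.writeNamespace("xhtml", XHTML);
            xmlw.writeCharacters("A tulip of almost black color.");
            xmlw.writeEndElement();
            xmlw.writeEndElement();
            xmlw.writeEndDocument();
            xmlw.close();
        } catch (XMLStreamException e) {
            throw new IllegalStateException(e);
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(render());
    }
}
```

Note that setPrefix() only tells the writer which prefix to use; it writes nothing to the output. Without the call to writeNamespace(), the description element would carry an undeclared prefix.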
Listing 1. Writing documents
import javax.xml.stream.*;

public class XMLWriter {

    // Namespaces
    private static final String GARDENING = "http://com.bdaum.gardening";
    private static final String XHTML = "http://www.w3.org/1999/xhtml";

    public static void main(String[] args) throws XMLStreamException {
        // Create an output factory
        XMLOutputFactory xmlof = XMLOutputFactory.newInstance();
        // Set namespace prefix defaulting for all created writers.
        // Early StAX drafts named this property
        // "javax.xml.stream.isPrefixDefaulting"; the final JSR 173 API
        // names it "javax.xml.stream.isRepairingNamespaces".
        xmlof.setProperty(XMLOutputFactory.IS_REPAIRING_NAMESPACES, Boolean.TRUE);
        // Create an XML stream writer
        XMLStreamWriter xmlw = xmlof.createXMLStreamWriter(System.out);
        // Write XML prologue
        xmlw.writeStartDocument();
        // Write a processing instruction (target and data passed separately)
        xmlw.writeProcessingInstruction(
            "xml-stylesheet", "href='catalog.xsl' type='text/xsl'");
        // Now start with root element
        xmlw.writeStartElement("product");
        // Declare the default namespace in the root element
        xmlw.writeDefaultNamespace(GARDENING);
        // Write a few attributes
        xmlw.writeAttribute("productNumber", "3923-1");
        xmlw.writeAttribute("name", "Nightshadow");
        // Declare XHTML prefix
        // xmlw.setPrefix("xhtml", XHTML);
        // Different namespace for description element
        xmlw.writeStartElement(XHTML, "description");
        // Declare XHTML namespace in the scope of the description element
        // xmlw.writeNamespace("xhtml", XHTML);
        xmlw.writeCharacters(
            "A tulip of almost black color. \nBlossoms in April & May");
        xmlw.writeEndElement();
        // Shorthand for empty elements
        xmlw.writeEmptyElement("supplier");
        xmlw.writeAttribute("name", "Floral22");
        // Closing the root element is optional here (see next call)
        // xmlw.writeEndElement();
        // Write document end. This closes all open structures
        xmlw.writeEndDocument();
        // Close the writer to flush the output
        xmlw.close();
    }
}
Note that StAX does not guarantee well-formed documents. It is still possible to produce a document that violates the XML recommendation, such as a document with several root elements or several XML prologues, or tag and attribute names containing whitespace or characters not supported by XML. StAX implementations may check these issues but they are not required to do so (the reference implementation doesn't). Nevertheless, the StAX XMLStreamWriter is a big improvement over outputting raw XML data, and it does this at a fraction of the cost of using DOM.
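One way to guard against this, sketched below under the assumption that the output is small enough to hold in memory, is to feed the generated text back through a StAX parser. The class WellFormedCheck and its methods are hypothetical names for this illustration. Whether writeTwoRoots() succeeds or throws depends on the StAX implementation in use, so the sketch handles both outcomes.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLOutputFactory;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;
import javax.xml.stream.XMLStreamWriter;

final class WellFormedCheck {

    // Returns true if a StAX parser can read the text to the end.
    static boolean isWellFormed(String xml) {
        try {
            XMLStreamReader reader = XMLInputFactory.newInstance()
                .createXMLStreamReader(new StringReader(xml));
            while (reader.hasNext()) {
                reader.next();
            }
            reader.close();
            return true;
        } catch (XMLStreamException e) {
            return false;
        }
    }

    // Deliberately writes two root elements. Returns the output, or null
    // if the writer implementation checks for this and rejects it.
    static String writeTwoRoots() {
        StringWriter out = new StringWriter();
        try {
            XMLStreamWriter xmlw =
                XMLOutputFactory.newInstance().createXMLStreamWriter(out);
            xmlw.writeEmptyElement("first");
            xmlw.writeEmptyElement("second");
            xmlw.writeEndDocument();
            xmlw.close();
        } catch (XMLStreamException e) {
            return null;
        }
        return out.toString();
    }

    public static void main(String[] args) {
        String xml = writeTwoRoots();
        if (xml == null) {
            System.out.println("The writer rejected the second root element");
        } else {
            System.out.println(xml + " well formed: " + isWellFormed(xml));
        }
    }
}
```

Re-parsing costs an extra pass over the output, so a check like this is better suited to tests than to production code paths.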
Summary
This tip has demonstrated the use of the cursor-based API of StAX for writing XML documents efficiently. In the next tip, I will show how to merge two XML documents using the event-based API.