Parsing XML with the DOM API
This section explains how to parse the XML document created in the previous section with a DOM parser. DOM parsing creates an in-memory tree that mirrors the structure of the parsed XML document, which you can then navigate using the DOM API. In this case, you'll see how to iterate over the parsed XML document and output element and attribute node values. The DOM parsing API classes are in the
oracle.xml.parser.v2 package and the DOM parser factory and parser classes are in the
oracle.xml.jaxp package. Open the class file you created earlier named
DOMParserApp.java in JDeveloper, and import the two packages:
import oracle.xml.jaxp.*;
import oracle.xml.parser.v2.*;
Create a JXDocumentBuilderFactory object as before:
JXDocumentBuilderFactory factory = (JXDocumentBuilderFactory)
JXDocumentBuilderFactory.newInstance();
Set the
ERROR_STREAM and
SHOW_WARNINGS attributes on the factory object with the
setAttribute() method by passing an OutputStream or a PrintWriter object for the
ERROR_STREAM value, and
Boolean.TRUE or
Boolean.FALSE for the
SHOW_WARNINGS attribute value. Parsing errors, if any, are written to the OutputStream or PrintWriter specified in the
ERROR_STREAM attribute. If an
ErrorHandler event handler is also set,
ERROR_STREAM is not used. The
SHOW_WARNINGS attribute causes the parser to output warnings as well:
factory.setAttribute(JXDocumentBuilderFactory.ERROR_STREAM, new
FileOutputStream(new File("c:/output/errorStream.txt")));
factory.setAttribute(JXDocumentBuilderFactory.SHOW_WARNINGS,
Boolean.TRUE);
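The ErrorHandler mentioned above is the standard SAX interface. As a rough sketch of what such a handler might look like, the following uses the JAXP classes that ship with the JDK rather than the Oracle XDK factory, so it does not touch ERROR_STREAM or SHOW_WARNINGS at all. The class name CollectingErrorHandler and the helper problemsIn() are hypothetical names chosen for illustration:

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.xml.sax.ErrorHandler;
import org.xml.sax.SAXParseException;

// Hypothetical handler (not part of the XDK): records problems in a list
// instead of writing them to an error stream.
class CollectingErrorHandler implements ErrorHandler {
    final List<String> messages = new ArrayList<>();

    private void record(String level, SAXParseException e) {
        messages.add(level + " at line " + e.getLineNumber() + ": " + e.getMessage());
    }

    @Override public void warning(SAXParseException e) { record("Warning", e); }
    @Override public void error(SAXParseException e) { record("Error", e); }
    @Override public void fatalError(SAXParseException e) throws SAXParseException {
        record("Fatal Error", e);
        throw e; // a fatal error means the document is not well-formed, so stop
    }

    // Parses the XML text with standard JAXP and returns the recorded problems.
    static List<String> problemsIn(String xml) {
        CollectingErrorHandler handler = new CollectingErrorHandler();
        try {
            DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
            builder.setErrorHandler(handler);
            builder.parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
        } catch (Exception e) {
            // Parse failures were already recorded by the handler; anything else
            // (for example a configuration problem) is recorded here.
            if (handler.messages.isEmpty()) {
                handler.messages.add("Failure: " + e);
            }
        }
        return handler.messages;
    }
}
```

Calling CollectingErrorHandler.problemsIn() with a well-formed document should return an empty list, while a document with a mismatched end tag should yield a single "Fatal Error" entry.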
Create a JXDocumentBuilder object from the factory object by first creating a DocumentBuilder object with the
newDocumentBuilder() method and then casting it to JXDocumentBuilder:
JXDocumentBuilder documentBuilder = (JXDocumentBuilder)
factory.newDocumentBuilder();
You can get a Document object from a JXDocumentBuilder object using one of the
parse() methods in the JXDocumentBuilder class, passing as input an InputSource, InputStream, File, or String URI. For the example XML document, create an InputStream and call the
parse(InputStream) method:
InputStream input = new FileInputStream(
new File("c:/J2EEApp/catalog.xml"));
XMLDocument xmlDocument = (XMLDocument)
(documentBuilder.parse(input));
The JXDocumentBuilder
parse() methods return a Document object, which you can cast to XMLDocument because that class implements the Document interface.
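The standard JAXP DocumentBuilder offers a similar set of parse() overloads, so the choice of input type can be tried out without the XDK on the classpath. The sketch below is an illustration only: it uses the JDK's own parser, and the ParseSources class and its inline sample document are hypothetical stand-ins for catalog.xml:

```java
import java.io.ByteArrayInputStream;
import java.io.StringReader;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

class ParseSources {
    // Hypothetical sample document, not the catalog.xml from the previous section.
    static final String XML = "<catalog><journal title=\"Sample\"/></catalog>";

    // Parses the same text either through an InputSource (character stream)
    // or through an InputStream (byte stream) and returns the root tag name.
    static String rootName(boolean useInputSource) {
        try {
            DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
            Document doc = useInputSource
                ? builder.parse(new InputSource(new StringReader(XML)))
                : builder.parse(new ByteArrayInputStream(XML.getBytes(StandardCharsets.UTF_8)));
            return doc.getDocumentElement().getTagName();
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse the sample XML", e);
        }
    }
}
```

Both paths produce the same tree; an InputSource is convenient when the XML is already available as characters, while an InputStream lets the parser detect the encoding from the bytes.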
Here's how to output the encoding and version of the XML document:
System.out.println("Encoding: " + xmlDocument.getEncoding());
System.out.println("Version: " + xmlDocument.getVersion());
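If you are working with the standard org.w3c.dom.Document interface instead of XMLDocument, the DOM Level 3 methods getXmlEncoding() and getXmlVersion() expose comparable information. A minimal sketch, assuming the JDK's built-in parser; the XmlDeclarationInfo class is a hypothetical name:

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;

class XmlDeclarationInfo {
    // Reports the encoding and version taken from the XML declaration.
    static String describe(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
            // getXmlEncoding() is null when the declaration names no encoding.
            return "Encoding: " + doc.getXmlEncoding() + ", Version: " + doc.getXmlVersion();
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse the XML text", e);
        }
    }
}
```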
The XMLDocument class has various getter methods to retrieve elements in a document, some of which I've listed in Table 5.
Table 5. Getter methods in the XMLDocument class.
Method Name | Description
getDocumentElement() | Returns the root element.
getElementById(String) | Returns the element with the specified ID.
getElementsByTagName(String) | Returns a NodeList of elements with the specified tag name.
getElementsByTagNameNS(String, String) | Returns a NodeList of elements with the specified namespace URI and local name.
As an example, suppose you want to retrieve title elements in the namespace
http://xdk.com/catalog/journal. Here's the code:
NodeList namespaceNodeList =
xmlDocument.getElementsByTagNameNS(
"http://xdk.com/catalog/journal","title");
You can iterate over the resulting NodeList to output the element namespace, element prefix, element tag name, and/or element text. The method calls are fairly self-explanatory; for example, the
getNamespaceURI() method returns the namespace URI of an element, the
getPrefix() method returns the prefix of an element in a namespace, and the
getTagName() method returns the element tag name. You can obtain an element's text by first retrieving the text node from the element node and then requesting the value of the text node:
for (int i = 0; i < namespaceNodeList.getLength(); i++)
{
    XMLElement namespaceElement = (XMLElement)
        namespaceNodeList.item(i);
    // getNamespaceURI() returns the URI; getPrefix() returns the prefix
    System.out.println("Namespace URI: " +
        namespaceElement.getNamespaceURI());
    System.out.println("Namespace Prefix: " +
        namespaceElement.getPrefix());
    System.out.println("Element Name: " +
        namespaceElement.getTagName());
    // The element's text is the value of its first child (a text node)
    System.out.println("Element text: " +
        namespaceElement.getFirstChild().getNodeValue());
}
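One detail worth knowing if you try the same lookup with the JDK's standard JAXP factory instead of the XDK: namespace processing is off by default there, so getElementsByTagNameNS() finds nothing unless you call setNamespaceAware(true). The following sketch shows this under that assumption; the NamespaceLookup class and the inline sample document are hypothetical, and only the namespace URI comes from the example above:

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class NamespaceLookup {
    static final String JOURNAL_NS = "http://xdk.com/catalog/journal";

    // Hypothetical sample: one title inside the namespace, one outside it.
    static final String XML =
        "<catalog xmlns:journal=\"" + JOURNAL_NS + "\">"
      + "<journal:journal><journal:title>Sample Title</journal:title></journal:journal>"
      + "<title>Not in the namespace</title>"
      + "</catalog>";

    // Returns "prefix:localName=text" for each title element in the namespace.
    static List<String> titles() {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true); // off by default in standard JAXP
            Document doc = factory.newDocumentBuilder()
                .parse(new InputSource(new StringReader(XML)));
            NodeList nodes = doc.getElementsByTagNameNS(JOURNAL_NS, "title");
            List<String> result = new ArrayList<>();
            for (int i = 0; i < nodes.getLength(); i++) {
                Element element = (Element) nodes.item(i);
                result.add(element.getPrefix() + ":" + element.getLocalName()
                    + "=" + element.getTextContent());
            }
            return result;
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse the sample XML", e);
        }
    }
}
```

Only the prefixed title is returned; the unprefixed title element belongs to no namespace and is skipped.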
You obtain the root element in the XML document using the
getDocumentElement() method:
XMLElement rootElement = (XMLElement)
(xmlDocument.getDocumentElement());
System.out.println("Root Element is: " +
rootElement.getTagName());
Often, you want to iterate through an entire document. To do that, starting with the root element, you obtain a NodeList of sub-nodes by calling
getChildNodes(). Then, you iterate over that NodeList and recursively obtain the sub-nodes of any nodes in the list that have sub-nodes. Several methods may prove useful. The method
hasChildNodes() tests whether a node has sub-nodes. The NodeList interface method
getLength() returns the length of a node list, and the method
item(int) returns the node at a specified index. Because the class XMLNode implements the Node interface, you can cast returned Node objects to the XMLNode type:
if (rootElement.hasChildNodes()) {
    NodeList nodeList = rootElement.getChildNodes();
    for (int i = 0; i < nodeList.getLength(); i++) {
        XMLNode node = (XMLNode) nodeList.item(i);
        // Process the node here; repeat the same steps for its sub-nodes
    }
}
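The recursive part of the traversal described above can be sketched as a small depth-first walk. This version is an illustration written against the standard org.w3c.dom interfaces and the JDK's parser rather than XMLNode; the TreeWalker class and its method names are hypothetical:

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class TreeWalker {
    // Appends one indented line per element, visiting the tree depth-first.
    static void walk(Node node, int depth, StringBuilder out) {
        if (node.getNodeType() == Node.ELEMENT_NODE) {
            for (int i = 0; i < depth; i++) {
                out.append("  ");
            }
            out.append(node.getNodeName()).append('\n');
        }
        // Leaf nodes return an empty list, so the recursion ends there.
        NodeList children = node.getChildNodes();
        for (int i = 0; i < children.getLength(); i++) {
            walk(children.item(i), depth + 1, out);
        }
    }

    // Parses the XML text and returns an indented outline of its elements.
    static String outline(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
            StringBuilder out = new StringBuilder();
            walk(doc.getDocumentElement(), 0, out);
            return out.toString();
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse the XML text", e);
        }
    }
}
```

Passing the depth along makes it easy to indent the output so that the printed outline mirrors the nesting of the document.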
If a retrieved node is of type
ELEMENT_NODE, you can retrieve the element's tag name. You can find out the Node type using the
getNodeType() method, which returns a short value representing one of the node types listed in Table 6.
Table 6. Node Types: All XML nodes return one of these type constants when queried with the getNodeType() method.
Node Type | Description
ELEMENT_NODE | Element node
ATTRIBUTE_NODE | Attribute node
TEXT_NODE | Text node
CDATA_SECTION_NODE | CDATA section node
ENTITY_REFERENCE_NODE | Entity reference node
ENTITY_NODE | Entity node
PROCESSING_INSTRUCTION_NODE | Processing instruction node
COMMENT_NODE | Comment node
DOCUMENT_NODE | Document node
DOCUMENT_TYPE_NODE | Doctype node
DOCUMENT_FRAGMENT_NODE | DocumentFragment node
NOTATION_NODE | Notation node
The following example prints an element's tag name only if it's an
ELEMENT_NODE type:
if (node.getNodeType() == XMLNode.ELEMENT_NODE) {
    XMLElement element = (XMLElement) node;
    System.out.println("Element Tag Name: " + element.getTagName());
}
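When a document mixes several kinds of nodes, a switch on getNodeType() is a common way to branch. The sketch below uses the constants defined on the standard org.w3c.dom.Node interface together with the JDK's parser; the NodeTypes class and the labels it returns are hypothetical choices for illustration:

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class NodeTypes {
    // Maps a few of the node type constants to readable labels.
    static String typeName(Node node) {
        switch (node.getNodeType()) {
            case Node.ELEMENT_NODE: return "Element";
            case Node.TEXT_NODE: return "Text";
            case Node.CDATA_SECTION_NODE: return "CDATA";
            case Node.COMMENT_NODE: return "Comment";
            case Node.PROCESSING_INSTRUCTION_NODE: return "ProcessingInstruction";
            default: return "Other(" + node.getNodeType() + ")";
        }
    }

    // Returns the type label of each direct child of the root element.
    static List<String> childTypes(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
            NodeList children = doc.getDocumentElement().getChildNodes();
            List<String> result = new ArrayList<>();
            for (int i = 0; i < children.getLength(); i++) {
                result.add(typeName(children.item(i)));
            }
            return result;
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse the XML text", e);
        }
    }
}
```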
You retrieve attributes from an element node with the
getAttributes() method, which returns a NamedNodeMap of attributes. Similar to the NodeList methods, the NamedNodeMap's
getLength() method returns the number of attributes in the map, and the
item(int) method returns an Attr object for the attribute at the specified index. You can cast Attr objects to XMLAttr. The method
hasAttributes() tests if an element node has attributes. Here's an example that checks to see whether an element has attributes, retrieves them if so, and then iterates over the NamedNodeMap to output each attribute's name and value:
if (element.hasAttributes()) {
    NamedNodeMap attributes = element.getAttributes();
    for (int i = 0; i < attributes.getLength(); i++) {
        XMLAttr attribute = (XMLAttr) attributes.item(i);
        System.out.println(" Attribute: " + attribute.getName() +
            " with value " + attribute.getValue());
    }
}
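The same attribute loop works with the standard org.w3c.dom.Attr interface. As a small variation, the sketch below copies the attributes into a sorted map, since a NamedNodeMap does not promise any particular order. It relies on the JDK's parser, and the AttributeDump class is a hypothetical name:

```java
import java.io.StringReader;
import java.util.Map;
import java.util.TreeMap;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Attr;
import org.w3c.dom.Element;
import org.w3c.dom.NamedNodeMap;
import org.xml.sax.InputSource;

class AttributeDump {
    // Returns the root element's attributes as a name-sorted map.
    static Map<String, String> attributesOf(String xml) {
        try {
            Element root = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)))
                .getDocumentElement();
            Map<String, String> result = new TreeMap<>();
            if (root.hasAttributes()) {
                NamedNodeMap attributes = root.getAttributes();
                for (int i = 0; i < attributes.getLength(); i++) {
                    Attr attribute = (Attr) attributes.item(i);
                    result.put(attribute.getName(), attribute.getValue());
                }
            }
            return result;
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse the XML text", e);
        }
    }
}
```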
Running the Parser Application
To run the sample application
DOMParserApp.java in JDeveloper, right-click the DOMParserApp.java node in Applications Navigator and select Run (see
Figure 5).
Figure 5. Running the Parsing Application: To run the parsing application, right-click the DOMParserApp.java item in JDeveloper, and select Run from the popup menu.
Figure 6. Sample Application Output: The figure shows the output from the sample DOMParserApp.java run in JDeveloper.
When you run the application, you'll see the element and attribute values from the XML document displayed as shown in
Figure 6.
Listing 3 shows the complete output from the DOM parsing application.
To demonstrate error handling with the
ERROR_STREAM attribute, introduce an error into the example XML document, such as removing an element's end tag.
Run the application again, and you'll see an error message written to the file specified in the
ERROR_STREAM attribute:
<Line 15, Column 10>: XML-20121: (Fatal Error) End tag
does not match start tag 'journal'.
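For comparison, the JDK's standard JAXP parser reports the same kind of problem by throwing a SAXParseException that carries the line and column. The sketch below formats that information in a style loosely modeled on the XDK message above; the message text itself comes from the JDK parser and will differ from the XDK's XML-20121 wording. The WellFormednessCheck class is a hypothetical name:

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.xml.sax.InputSource;
import org.xml.sax.SAXParseException;

class WellFormednessCheck {
    // Returns "OK" for a well-formed document, or a located error message.
    static String check(String xml) {
        try {
            DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
            builder.parse(new InputSource(new StringReader(xml)));
            return "OK";
        } catch (SAXParseException e) {
            // The exception records where the parser gave up.
            return "<Line " + e.getLineNumber() + ", Column " + e.getColumnNumber()
                + ">: " + e.getMessage();
        } catch (Exception e) {
            return "Could not parse: " + e;
        }
    }
}
```

With no ErrorHandler set, the JDK parser may also print its own message to standard error before the exception reaches your code.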