
Creating and Parsing XML Documents in JDeveloper

Oracle’s XDK 10g extends JAXP to make reading, writing, and querying XML easy.

One of the first programming exercises for beginning XML developers is to create and parse an XML document. The Java API for XML Processing (JAXP) includes an API for doing this. Oracle XDK 10g extends the base JAXP capabilities with an API in the oracle.xml.parser.v2 package that overrides some of the JAXP classes and implements additional interfaces such as DocumentEditVAL, ElementEditVAL, and NSResolver, which add dynamic validation and aid in using XPath. XDK also provides parser factory and parser classes in the oracle.xml.jaxp package, which override the parser classes in the javax.xml.parsers package.

The oracle.xml.parser.v2 package provides APIs for DOM and SAX parsing. The oracle.xml.jaxp package provides APIs for obtaining parsers for DOM and SAX parsing.

This article shows you how to create and parse an XML document using JDeveloper. Listing 1 shows the XML document that you’ll create and work with in the examples.

Note that some of the elements and attributes in the example XML document in Listing 1 belong to a specific namespace, http://xdk.com/catalog/journal, denoted by the prefix journal. I’ve included namespace nodes in the sample document to demonstrate the API’s features for creating and parsing namespace nodes.

Setting the Environment

 
Figure 1. JDeveloper XML Application: Here’s how the project looks so far in the Applications Navigator.

To get started, download and install Oracle JDeveloper 10.1.3. Create an application workspace in JDeveloper by selecting File > New > General > Application. Specify an application name, XMLParserApp, in the “Create Application” pane and click on the OK button. Specify a project name, XMLParser, in the “Create Project” pane and click on the OK button. Taking those actions adds a new application and a project to the JDeveloper Applications Navigator.

You’ll be creating applications that create an XML document, parse an XML document with the DOM API, and parse an XML document with the SAX API.

Select your new project node in the Applications Navigator (the node with the name you specified as the project name) and select File > New > General > Java Class. Click on the OK button. In the Create Java Class pane, specify a class name (“CreateXMLDocument” for example) and a package name, and click on the OK button. Similarly, add two more Java classes named “DOMParserApp” and “SAXParserApp” to your JDeveloper project. The result should look like Figure 1.

Next, you need to ensure the XDK parser classes are in the classpath of the XMLParser project. Select the project node in the Applications Navigator and select Tools > Project Properties > Libraries, and then click on the “Add Library” button to add a library. In the Add Library pane select the Oracle XML Parser v2 library (see Figure 2), and click on the OK button.

 
Figure 2. Adding a Library: Select the Oracle XML Parser v2 library from the Add Library list.
 
Figure 3. Project Libraries: You can see and manage the list of libraries added to your project from the Project Properties dialog.

That adds the Oracle XML Parser v2 library to your project’s library references (see Figure 3). Click on the OK button in the Project Properties pane to close the dialog.

Creating an XML Document
In this section you’ll see how to create the XML document shown in Listing 1 using the JDeveloper class you just created named CreateXMLDocument.java. Open that class file, and import the DOM and SAX parsing APIs package oracle.xml.parser.v2, and the DOM and SAX parsers package oracle.xml.jaxp:

   import oracle.xml.jaxp.*;
   import oracle.xml.parser.v2.*;

Create a JXDocumentBuilderFactory object with the static newInstance() method:

   JXDocumentBuilderFactory factory = (JXDocumentBuilderFactory)
      JXDocumentBuilderFactory.newInstance();

The JXDocumentBuilderFactory class extends the DocumentBuilderFactory class and adds static fields that name the attributes you can set on the factory. Table 1 lists the most important attributes. You’ll use two of these, the ERROR_STREAM and SHOW_WARNINGS attributes, later in the DOM parsing section.

Table 1. JXDocumentBuilderFactory Class Attributes: The table lists several of the most important attributes, along with a short description of each.
| Attribute | Description |
| --- | --- |
| BASE_URL | The base URL for parsing entities. |
| DEBUG_MODE | A debug-mode switch that you can set to Boolean.TRUE or Boolean.FALSE. |
| ERROR_STREAM | The error stream to use for reporting errors. The value is an OutputStream or PrintWriter. If the ErrorHandler event handler is set, ERROR_STREAM is not used. |
| SCHEMA_LANGUAGE | Schema language for validation. |
| SCHEMA_SOURCE | Schema source for validation. |
| SHOW_WARNINGS | Whether the parser outputs warnings. The value may be Boolean.TRUE or Boolean.FALSE. |

Create a JXDocumentBuilder object using the factory object by calling the newDocumentBuilder() method. Cast the DocumentBuilder object returned by the method to a JXDocumentBuilder type as shown below:

   JXDocumentBuilder documentBuilder = (JXDocumentBuilder)
      factory.newDocumentBuilder();

You obtain a Document object from the JXDocumentBuilder object by calling the newDocument() method. The XMLDocument class implements the Document interface, so you need to cast the returned Document object to an XMLDocument.
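For comparison, the same factory, builder, document chain can be sketched with only the standard JAXP classes that ship with the JDK. This is a minimal sketch rather than the XDK code: the class name NewDocumentSketch is hypothetical, and with XDK you would use the JXDocumentBuilderFactory, JXDocumentBuilder, and XMLDocument casts described above instead.

```java
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;

// Hypothetical class name; uses only standard JAXP, not the XDK classes.
class NewDocumentSketch {

    // Factory -> builder -> empty Document, the same chain the text describes.
    static Document newEmptyDocument() {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            DocumentBuilder builder = factory.newDocumentBuilder();
            return builder.newDocument();
        } catch (ParserConfigurationException e) {
            throw new IllegalStateException("Could not create a DocumentBuilder", e);
        }
    }

    public static void main(String[] args) {
        Document document = newEmptyDocument();
        // A freshly created document has no root element until you append one.
        System.out.println("Has root element: " + (document.getDocumentElement() != null));
    }
}
```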

The XMLDocument class also implements several other interfaces:

  • DocumentEditVAL
  • ElementEditVAL
  • DocumentEvent
  • DocumentTraversal
  • EventTarget
  • NSResolver

The DocumentEditVAL and ElementEditVAL interfaces add dynamic validation capabilities as specified in the DOM 3 Validation specification. See my DevX article Using DOM 3.0 Validation Techniques with XDK 10g for more information.

The DocumentEvent and EventTarget interfaces add event-handling capabilities, while the NSResolver interface aids in selecting namespace nodes with XPath.

The XMLDocument class also provides some methods not specified in any of the implemented interfaces. Table 2 shows some of these methods.

Table 2. Additional XMLDocument Methods: The table lists some additional XMLDocument methods along with a description of each.
| Method | Description |
| --- | --- |
| addID(String, XMLElement) | Adds an ID element for the document. |
| expectedElements(Element) | Returns a vector of elements that may be added to the specified element. |
| getEncoding() | Gets the document encoding. |
| setEncoding(String) | Sets the document encoding. |
| getVersion() | Gets the XML version. |
| setVersion(String) | Sets the XML version. |
| getIDHashtable() | Returns a hashtable of element IDs. |
| getSchema() | Gets the XMLSchema specified in the document. |
| setSchema(XMLSchema) | Sets the XMLSchema for the document. |
| printExternalDTD(OutputStream, String) | Prints an external DTD to the specified OutputStream using the specified encoding. |
| setDoctype(String rootname, String sysid, String pubid) | Sets the Doctype. |
| setStandalone(String) | Sets standalone mode for the XML declaration. |
| getStandalone() | Gets the standalone mode for the XML declaration. |

To create any XML document, you first create an XML declaration by setting the XML version and the document encoding for output:

   xmlDocument.setVersion("1.0");
   xmlDocument.setEncoding("UTF-8");

Then you create the remainder of the document nodes in sequence, creating each element and appending it to its parent, starting with the root node. In this case, for example, create the root element using the createElement(String) method. Cast the Element object returned by the createElement() method to XMLElement:

   XMLElement catalogElement = (XMLElement)
      (xmlDocument.createElement("catalog"));

The XMLElement class implements the Element, ElementEditVAL, and NSResolver interfaces used for standard XML element features, DOM 3 Validation, and XPath namespace node selection respectively. In addition to the validation methods in ElementEditVAL, XMLElement class exposes overloaded validateContent() methods to validate an element.

Add the new root element to the XMLDocument object:

   xmlDocument.appendChild(catalogElement);

Create a namespace-qualified element with the createElementNS(String, String) method, passing the namespace URI and the qualified name:

   XMLElement journalElement = (XMLElement)
      (xmlDocument.createElementNS(
      "http://xdk.com/catalog/journal", "journal:journal"));

Add the journal element to the root element:

   catalogElement.appendChild(journalElement);

Add the namespace-qualified attribute journal:title with the setAttributeNS(String, String, String) method:

   journalElement.setAttributeNS(
      "http://xdk.com/catalog/journal",
      "journal:title", "Oracle Magazine");

You create the journal:publisher and journal:author elements similarly. Add both elements to the journal:journal element.

Next, create an XMLText node to set the text of the title element using the createTextNode(String) method:

   XMLText title = (XMLText) xmlDocument.createTextNode(
      "Creating Search Pages");

Add the XMLText node to the journal:title element.

   titleElement.appendChild(title);

The process to add the other elements and text nodes in the example XML document in Listing 1 is similar, so I won’t list it exhaustively here. The XMLDocument class provides additional methods to create XML document elements other than those discussed in this section, so I’ve listed some of them in Table 3.
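The element, attribute, and text-node steps above can be sketched end to end with the standard org.w3c.dom interfaces, which XMLDocument and XMLElement implement. This is an illustrative sketch only: the class name BuildCatalogSketch is hypothetical, the placement of the title element under journal:journal is an assumption (Listing 1 defines the real structure), and no XDK casts are used.

```java
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

// Hypothetical class; standard DOM calls only, so no XDK casts are needed.
class BuildCatalogSketch {
    static final String JOURNAL_NS = "http://xdk.com/catalog/journal";

    static Document buildCatalog() {
        Document doc;
        try {
            doc = DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument();
        } catch (ParserConfigurationException e) {
            throw new IllegalStateException("Could not create a DocumentBuilder", e);
        }
        // Root element, not in any namespace.
        Element catalog = doc.createElement("catalog");
        doc.appendChild(catalog);

        // Namespace-qualified element and attribute.
        Element journal = doc.createElementNS(JOURNAL_NS, "journal:journal");
        journal.setAttributeNS(JOURNAL_NS, "journal:title", "Oracle Magazine");
        catalog.appendChild(journal);

        // A child element holding a text node; its name and position are assumed.
        Element title = doc.createElement("title");
        title.appendChild(doc.createTextNode("Creating Search Pages"));
        journal.appendChild(title);
        return doc;
    }

    public static void main(String[] args) {
        Document doc = buildCatalog();
        Element journal = (Element) doc.getDocumentElement().getFirstChild();
        System.out.println(journal.getTagName() + " title attribute: "
            + journal.getAttributeNS(JOURNAL_NS, "title"));
    }
}
```

Note that createElementNS() and setAttributeNS() take the qualified name (with prefix), while getAttributeNS() takes the local name.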

Table 3. The table shows the various XMLDocument methods to create XML document content.
| Method Name | Description |
| --- | --- |
| createCDATASection(java.lang.String data) | Creates a CData section. |
| createComment(java.lang.String data) | Creates a comment. |
| createEntityReference(java.lang.String name) | Creates an entity reference. |
| createProcessingInstruction(java.lang.String target, java.lang.String data) | Creates a processing instruction. |

When you’ve finished adding all the nodes, you output the XML document with the XMLPrintDriver class. First, create an OutputStream and then create an XMLPrintDriver by passing it a PrintWriter that wraps the OutputStream.

   OutputStream output = new FileOutputStream(
      new File("c:/output/catalog.xml"));
   XMLPrintDriver xmlPrintDriver = new XMLPrintDriver(
      new PrintWriter(output));
 
Figure 4. Running the Application. Right click on the CreateXMLDocument.java item in JDeveloper and select Run from the popup menu.

Write the XML document to the output stream using the printDocument(XMLDocument) method.

   xmlPrintDriver.printDocument(xmlDocument);
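If you want to compare the result with what plain JAXP produces, a DOM document can also be serialized with the standard javax.xml.transform API. The sketch below is an alternative technique, not XMLPrintDriver; the class name SerializeSketch and its helper methods are hypothetical.

```java
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;

// Hypothetical class; serializes a DOM tree with an identity Transformer.
class SerializeSketch {

    static String toXml(Document doc) {
        try {
            Transformer transformer = TransformerFactory.newInstance().newTransformer();
            transformer.setOutputProperty(OutputKeys.ENCODING, "UTF-8");
            transformer.setOutputProperty(OutputKeys.INDENT, "yes");
            StringWriter out = new StringWriter();
            transformer.transform(new DOMSource(doc), new StreamResult(out));
            return out.toString();
        } catch (Exception e) {
            throw new IllegalStateException("Serialization failed", e);
        }
    }

    // Builds a one-element document so the sketch runs on its own.
    static Document sampleDocument() {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument();
            doc.appendChild(doc.createElement("catalog"));
            return doc;
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(toXml(sampleDocument()));
    }
}
```

To write to a file instead of a string, pass a StreamResult that wraps a FileOutputStream.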

XMLPrintDriver can print not only an XMLDocument node, but other nodes as well. Table 4 lists additional print methods in the XMLPrintDriver class that you may find useful.

You can see the complete listing for the CreateXMLDocument.java in Listing 2.

To run the completed CreateXMLDocument.java application in JDeveloper, right click on CreateXMLDocument.java in the Applications Navigator and select “Run” (see Figure 4). The application should generate the XML document.

Table 4. The table shows useful print Methods in the XMLPrintDriver class.
| Print Method | Description |
| --- | --- |
| printAttribute(XMLAttr) | Prints an attribute node. |
| printAttributeNodes(XMLElement) | Prints attributes in an element node. |
| printCDATASection(XMLCDATA) | Prints a CData section node. |
| printChildNodes(XMLNode) | Prints the child nodes of a node. |
| printComment(XMLComment) | Prints a comment node. |
| printDoctype(DTD) | Prints a DTD. |
| printDocument(XMLDocument) | Prints a document. |
| printDocumentFragment(XMLDocumentFragment) | Prints a document fragment. |
| printElement(XMLElement) | Prints an element node. |
| printEntityReference(XMLEntityReference) | Prints an entity reference node. |
| printProcessingInstruction(XMLPI) | Prints a processing instruction node. |
| printTextNode(XMLText) | Prints a text node. |

Parsing XML with the DOM API
This section explains how to parse an XML document (the XML document created in the previous section) with a DOM parser. DOM parsing creates an in-memory tree that mirrors the structure of the parsed XML document. Subsequently, you can navigate the tree using the DOM API. In this case, you’ll see how to iterate over the parsed XML document and output element and attribute node values. The DOM parsing API classes are in the oracle.xml.parser.v2 package and the DOM parser factory and parser classes are in the oracle.xml.jaxp package. Open the class file you created earlier named DOMParserApp.java in JDeveloper, and import the two packages:

   import oracle.xml.jaxp.*;
   import oracle.xml.parser.v2.*;

Create a JXDocumentBuilderFactory object as before:

   JXDocumentBuilderFactory factory = (JXDocumentBuilderFactory)
      JXDocumentBuilderFactory.newInstance();

Set the ERROR_STREAM and SHOW_WARNINGS attributes on the factory object with the setAttribute() method by passing an OutputStream or a PrintWriter object for the ERROR_STREAM value, and Boolean.TRUE or Boolean.FALSE for the SHOW_WARNINGS attribute value. With the OutputStream or PrintWriter specified in ERROR_STREAM attribute, parsing errors (if any) get output to the specified file. If the ErrorHandler event handler is also set, ERROR_STREAM is not used. The SHOW_WARNINGS attribute causes the parser to output warnings also:

   factory.setAttribute(JXDocumentBuilderFactory.ERROR_STREAM,
      new FileOutputStream(new File("c:/output/errorStream.txt")));
   factory.setAttribute(JXDocumentBuilderFactory.SHOW_WARNINGS,
      Boolean.TRUE);

Create a JXDocumentBuilder object from the factory object by calling the newDocumentBuilder() method and casting the returned DocumentBuilder to JXDocumentBuilder:

   JXDocumentBuilder documentBuilder = (JXDocumentBuilder)
      factory.newDocumentBuilder();

You can get a Document object from a JXDocumentBuilder object using one of the parse() methods in the JXDocumentBuilder class, passing as input an InputSource, InputStream, File, or String URI. For the example XML document, create an InputStream and call the parse(InputStream) method:

   InputStream input = new FileInputStream(
      new File("c:/J2EEApp/catalog.xml"));
   XMLDocument xmlDocument = (XMLDocument)
      (documentBuilder.parse(input));

The JXDocumentBuilder parse() methods return a Document object, which you can cast to XMLDocument because that class implements the Document interface.

Here’s how to output the encoding and version of the XML document:

   System.out.println("Encoding: " + xmlDocument.getEncoding());
   System.out.println("Version: " + xmlDocument.getVersion());
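The parse step can be tried without XDK on the classpath by using the standard JAXP DocumentBuilder. The following is a minimal sketch under stated assumptions: the class name ParseSketch is hypothetical, the inline SAMPLE string is an illustrative stand-in for catalog.xml, and getXmlEncoding() and getXmlVersion() are the DOM Level 3 counterparts of the XDK getEncoding() and getVersion() methods.

```java
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;

// Hypothetical class; parses an in-memory document with standard JAXP.
class ParseSketch {
    // Illustrative stand-in for the article's catalog.xml.
    static final String SAMPLE =
        "<?xml version=\"1.0\" encoding=\"UTF-8\"?>"
        + "<catalog xmlns:journal=\"http://xdk.com/catalog/journal\">"
        + "<journal:journal journal:title=\"Oracle Magazine\"/>"
        + "</catalog>";

    static Document parse(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true); // standard JAXP is not namespace aware by default
            DocumentBuilder builder = factory.newDocumentBuilder();
            InputStream input = new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8));
            return builder.parse(input);
        } catch (Exception e) {
            throw new IllegalStateException("Parse failed", e);
        }
    }

    public static void main(String[] args) {
        Document doc = parse(SAMPLE);
        System.out.println("Encoding: " + doc.getXmlEncoding());
        System.out.println("Version: " + doc.getXmlVersion());
        System.out.println("Root: " + doc.getDocumentElement().getTagName());
    }
}
```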

The XMLDocument class has various getter methods to retrieve elements in a document, some of which I’ve listed in Table 5.

Table 5. The table lists getter methods in the XMLDocument class.
| Method Name | Description |
| --- | --- |
| getDocumentElement() | Returns the root element. |
| getElementById(String) | Returns the element for a specified ID. |
| getElementsByTagName(String) | Returns a NodeList of elements for a specified tag name. |
| getElementsByTagNameNS(String, String) | Returns a NodeList of elements for a specified namespace URI and local name. |

As an example, suppose you want to retrieve title elements in the namespace http://xdk.com/catalog/journal. Here’s the code:

   NodeList namespaceNodeList =
      xmlDocument.getElementsByTagNameNS(
      "http://xdk.com/catalog/journal", "title");

You can iterate over the resulting NodeList to output the element namespace, element prefix, element tag name, and/or element text. The method calls are fairly self-explanatory; for example, the getNamespaceURI() method returns the namespace URI of an element, the getPrefix() method returns the prefix of an element in a namespace, and the getTagName() method returns the element tag name. You can obtain an element’s text by first retrieving the text node from the element node and then requesting the value of the text node:

   for (int i = 0; i < namespaceNodeList.getLength(); i++) {
      XMLElement namespaceElement = (XMLElement)
         namespaceNodeList.item(i);
      System.out.println("Namespace URI: " +
         namespaceElement.getNamespaceURI());
      System.out.println("Namespace Prefix: " +
         namespaceElement.getPrefix());
      System.out.println("Element Name: " +
         namespaceElement.getTagName());
      System.out.println("Element text:  " +
         namespaceElement.getFirstChild().getNodeValue());
   }
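The namespace lookup is easy to get subtly wrong, so here is a self-contained sketch of the same loop using the standard DOM interfaces. The class name NamespaceLookupSketch and the SAMPLE input are hypothetical; the sample deliberately includes a second title element outside the namespace to show that getElementsByTagNameNS() skips it.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical class; selects elements by namespace URI and local name.
class NamespaceLookupSketch {
    static final String JOURNAL_NS = "http://xdk.com/catalog/journal";

    // Illustrative input only; not the article's Listing 1.
    static final String SAMPLE =
        "<catalog xmlns:journal=\"" + JOURNAL_NS + "\">"
        + "<journal:title>Creating Search Pages</journal:title>"
        + "<title>Not in the namespace</title>"
        + "</catalog>";

    static List<String> describeTitles(String xml) {
        List<String> lines = new ArrayList<>();
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true); // required for the NS lookup to match
            Document doc = factory.newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
            NodeList titles = doc.getElementsByTagNameNS(JOURNAL_NS, "title");
            for (int i = 0; i < titles.getLength(); i++) {
                Element title = (Element) titles.item(i);
                lines.add(title.getNamespaceURI() + " | " + title.getPrefix() + " | "
                    + title.getTagName() + " | " + title.getTextContent());
            }
        } catch (Exception e) {
            throw new IllegalStateException("Parse failed", e);
        }
        return lines;
    }

    public static void main(String[] args) {
        for (String line : describeTitles(SAMPLE)) {
            System.out.println(line);
        }
    }
}
```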

You obtain the root element in the XML document using the getDocumentElement() method:

   XMLElement rootElement = (XMLElement)
      (xmlDocument.getDocumentElement());
   System.out.println("Root Element is: " +
      rootElement.getTagName());

Often, you want to iterate through an entire document. To do that, starting with the root element, you obtain a NodeList of sub-nodes by calling getChildNodes(). Then, you iterate over that NodeList and recursively obtain the sub-nodes of any nodes in the list that have sub-nodes. Several methods may prove useful. The method hasChildNodes() tests whether a node has sub-nodes. The NodeList interface method getLength() returns the length of a node list, and the method item(int) returns the node at a specified index. Because the class XMLNode implements the Node interface, you can cast returned Node objects to the XMLNode type:

   if (rootElement.hasChildNodes()) {
      NodeList nodeList = rootElement.getChildNodes();
      for (int i = 0; i < nodeList.getLength(); i++) {
         XMLNode node = (XMLNode) nodeList.item(i);
      }
   }

If a retrieved node is of type ELEMENT_NODE, you can retrieve the element's tag name. You can find out the Node type using the getNodeType() method, which returns a short value representing one of the node types listed in Table 6.

Table 6. Node Types: All XML nodes return one of these type constants when queried with the getNodeType() method.
| Node Type | Description |
| --- | --- |
| ELEMENT_NODE | Element node |
| ATTRIBUTE_NODE | Attribute node |
| TEXT_NODE | Text node |
| CDATA_SECTION_NODE | CData section node |
| ENTITY_REFERENCE_NODE | Entity reference node |
| ENTITY_NODE | Entity node |
| PROCESSING_INSTRUCTION_NODE | Processing Instruction node |
| COMMENT_NODE | Comment node |
| DOCUMENT_NODE | Document node |
| DOCUMENT_TYPE_NODE | Doctype node |
| DOCUMENT_FRAGMENT_NODE | DocumentFragment node |
| NOTATION_NODE | Notation node |

The following example prints an element's tag name only if it's an ELEMENT_NODE type:

   if (node.getNodeType() == XMLNode.ELEMENT_NODE) {
      XMLElement element = (XMLElement) node;
      System.out.println("Element Tag Name: " + element.getTagName());
   }
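The child-node loop and the node-type check combine naturally into a recursive walk over the whole tree. The following sketch shows one way to do that with the standard Node and NodeList interfaces; the class name WalkSketch and the outline format (two spaces of indent per level) are hypothetical choices for illustration.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical class; walks a DOM tree depth first and records element names.
class WalkSketch {

    // Record the tag name of element nodes, then recurse into any child nodes.
    static void walk(Node node, int depth, List<String> out) {
        if (node.getNodeType() == Node.ELEMENT_NODE) {
            StringBuilder line = new StringBuilder();
            for (int i = 0; i < depth; i++) {
                line.append("  ");
            }
            out.add(line.append(((Element) node).getTagName()).toString());
        }
        if (node.hasChildNodes()) {
            NodeList children = node.getChildNodes();
            for (int i = 0; i < children.getLength(); i++) {
                walk(children.item(i), depth + 1, out);
            }
        }
    }

    static List<String> elementOutline(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
            List<String> out = new ArrayList<>();
            walk(doc.getDocumentElement(), 0, out);
            return out;
        } catch (Exception e) {
            throw new IllegalStateException("Parse failed", e);
        }
    }

    public static void main(String[] args) {
        String xml = "<catalog><journal><article><title>Sample</title></article></journal></catalog>";
        for (String line : elementOutline(xml)) {
            System.out.println(line);
        }
    }
}
```

Text, comment, and other non-element nodes are visited too, but only element nodes are recorded.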

You retrieve attributes from an element node with the getAttributes() method, which returns a NamedNodeMap of attributes. Similar to the NodeList methods, the NamedNodeMap's getLength() method returns the length of an attribute node list, and the item(int) method returns an Attr object for the attribute at the specified index. You can cast Attr objects to XMLAttr. The method hasAttributes() tests if an element node has attributes. Here's an example that checks to see whether an element has attributes, retrieves them if so, and then iterates over the NamedNodeMap to output each attribute's name and value:

   if (element.hasAttributes()) {
      NamedNodeMap attributes = element.getAttributes();
      for (int i = 0; i < attributes.getLength(); i++) {
         XMLAttr attribute = (XMLAttr) attributes.item(i);
         System.out.println(" Attribute: " + attribute.getName() +
            " with value " + attribute.getValue());
      }
   }
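As a runnable counterpart, the sketch below performs the same attribute iteration with the standard Attr and NamedNodeMap interfaces. The class name AttributeSketch and the sample element are hypothetical. Keep in mind that a NamedNodeMap does not guarantee any particular attribute order.

```java
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Attr;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NamedNodeMap;

// Hypothetical class; lists an element's attributes as name=value strings.
class AttributeSketch {

    static List<String> attributesOf(Element element) {
        List<String> result = new ArrayList<>();
        if (element.hasAttributes()) {
            NamedNodeMap attributes = element.getAttributes();
            for (int i = 0; i < attributes.getLength(); i++) {
                Attr attribute = (Attr) attributes.item(i);
                result.add(attribute.getName() + "=" + attribute.getValue());
            }
        }
        return result;
    }

    // Builds a small illustrative element so the sketch runs on its own.
    static Element sampleElement() {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument();
            Element journal = doc.createElement("journal");
            journal.setAttribute("title", "Oracle Magazine");
            return journal;
        } catch (ParserConfigurationException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(attributesOf(sampleElement()));
    }
}
```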

Running the Parser Application
To run the sample application DOMParserApp.java in JDeveloper, right click on the DOMParserApp.java node in Applications Navigator and select Run (see Figure 5).

 
Figure 5. Running the Parsing Application: To run the parsing application, right click on the DOMParserApp.java item in JDeveloper, and select Run from the popup menu.
 
Figure 6. Sample Application Output: The figure shows the output from the sample DOMParserApp.java run in JDeveloper.

When you run the application, you'll see the element and attribute values from the XML document displayed as shown in Figure 6.

Listing 3 shows the complete output from the DOM parsing application.

To demonstrate error handling with the ERROR_STREAM attribute, add an error to the example XML document, such as removing an element's end tag. Run the application again, and you'll see an error message output to the file specified in the ERROR_STREAM attribute:

   : XML-20121: (Fatal Error) End tag    does not match start tag 'journal'.
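ERROR_STREAM is specific to XDK. With plain JAXP, the usual way to capture the same kind of failure is to register an org.xml.sax.ErrorHandler on the DocumentBuilder. The sketch below shows that alternative; the class name DomErrorSketch and the message format are hypothetical, and the exact error text depends on the parser implementation.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.xml.sax.ErrorHandler;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.SAXParseException;

// Hypothetical class; captures a fatal parse error through an ErrorHandler.
class DomErrorSketch {

    // Returns a description of the fatal error, or null if the document is well formed.
    static String firstFatalError(String xml) {
        final StringBuilder message = new StringBuilder();
        try {
            DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
            builder.setErrorHandler(new ErrorHandler() {
                public void warning(SAXParseException e) { }
                public void error(SAXParseException e) { }
                public void fatalError(SAXParseException e) throws SAXException {
                    message.append("line ").append(e.getLineNumber())
                           .append(": ").append(e.getMessage());
                    throw e; // stop parsing after recording the problem
                }
            });
            builder.parse(new InputSource(new StringReader(xml)));
        } catch (SAXException e) {
            // Expected for malformed input; the handler already recorded the message.
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
        return message.length() == 0 ? null : message.toString();
    }

    public static void main(String[] args) {
        // The journal element is never closed, so the end tag does not match.
        System.out.println(firstFatalError("<catalog><journal></catalog>"));
    }
}
```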

Parsing an XML Document with the SAX API
SAX parsing is based on a push model in which a SAX parser generates events, and a document handler receives notification of those events. SAX parsing is faster than DOM parsing, but is limited in scope to generating parsing events; it has no provision for navigating nodes or retrieving nodes with XPath. This section shows you how to parse the example XML document with a SAX parser and handle the parser events. The sample SAX parsing application was developed in JDeveloper using the downloadable SAXParserApp.java application. First, import the oracle.xml.jaxp package:

   import oracle.xml.jaxp.*; 

A SAX parsing application typically extends the DefaultHandler class, which exposes event notification methods for the parse events. The DefaultHandler class implements the ErrorHandler interface, so you can use a DefaultHandler to handle errors directly. Create a JXSAXParserFactory object by calling the static method newInstance(). The newInstance() method returns a SAXParserFactory object that may be cast to JXSAXParserFactory because it extends the SAXParserFactory class:

   JXSAXParserFactory factory = (JXSAXParserFactory)
      JXSAXParserFactory.newInstance();

Create a SAXParser object from the factory object with the newSAXParser() method. JXSAXParser extends SAXParser, so you can cast a SAXParser object to JXSAXParser:

   JXSAXParser saxParser = (JXSAXParser) factory.newSAXParser();

Create an InputStream for the XML document you want to parse and call one of the parse() methods in SAXParser class. The parse method parameters require an XML document in any of the forms InputSource, InputStream, URI, or File, and an event handler such as the DefaultHandler:

   InputStream input = new FileInputStream(
      new File("c:/j2eeapp/catalog.xml"));
   saxParser.parse(input, this);

The DefaultHandler class provides the parsing event notification methods and error handling methods, which you can override with custom event and error handling methods. Table 7 lists some of the event notification methods in the DefaultHandler class.

Table 7. DefaultHandler Event Notification Methods: The table lists the event notification methods and describes when each event fires.
| Method Name | Description |
| --- | --- |
| startDocument() | Fires when the parser reaches the start of the document. |
| startElement(java.lang.String uri, java.lang.String localName, java.lang.String qName, Attributes attributes) | Fires when the parser reaches the beginning of an element. The uri argument specifies the namespace URI. localName specifies the element's local name, which is the element name without the prefix. qName specifies the element's qualified name, with prefix. attributes specifies the list of attributes in an element. |
| endDocument() | Fires when the parser reaches the end of the document. |
| endElement(java.lang.String uri, java.lang.String localName, java.lang.String qName) | Fires when the parser reaches the end of an element. |
| characters(char[] ch, int start, int length) | Fires when the parser has read some text from a text node. |

The SAXParserApp.java application overrides some of the notification methods to output the event type, element name, element attributes, and element text. For example, you can iterate through the attributes in the Attributes object and output the attribute name, namespace URI, and attribute value:

   for (int i = 0; i < atts.getLength(); i++) {
      System.out.println("Attribute QName:" + atts.getQName(i));
      System.out.println("Attribute Local Name:" +
         atts.getLocalName(i));
      System.out.println("Attribute Namespace URI:" +
         atts.getURI(i));
      System.out.println("Attribute Value:" + atts.getValue(i));
   }
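Putting the SAX pieces together, the sketch below extends DefaultHandler, overrides three notification methods, and records each event in order. It uses the standard javax.xml.parsers.SAXParserFactory rather than JXSAXParserFactory, and the class name SaxEventsSketch and the event-string format are hypothetical.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical class; records SAX events in the order the parser fires them.
class SaxEventsSketch extends DefaultHandler {
    private final List<String> events = new ArrayList<>();

    @Override
    public void startElement(String uri, String localName, String qName, Attributes atts) {
        events.add("start:" + qName);
        for (int i = 0; i < atts.getLength(); i++) {
            events.add("attr:" + atts.getQName(i) + "=" + atts.getValue(i));
        }
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        events.add("end:" + qName);
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        // Text may arrive in several chunks; whitespace-only chunks are skipped here.
        String text = new String(ch, start, length).trim();
        if (!text.isEmpty()) {
            events.add("text:" + text);
        }
    }

    static List<String> parse(String xml) {
        try {
            SAXParserFactory factory = SAXParserFactory.newInstance();
            factory.setNamespaceAware(true);
            SAXParser parser = factory.newSAXParser();
            SaxEventsSketch handler = new SaxEventsSketch();
            parser.parse(new InputSource(new StringReader(xml)), handler);
            return handler.events;
        } catch (Exception e) {
            throw new IllegalStateException("SAX parse failed", e);
        }
    }

    public static void main(String[] args) {
        String xml = "<catalog><journal title=\"Oracle Magazine\">Sample text</journal></catalog>";
        for (String event : parse(xml)) {
            System.out.println(event);
        }
    }
}
```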

You can also override the error handler methods in the DefaultHandler class. The SAXParserApp.java application overrides some of the methods to output error messages. Table 8 lists the error handler methods in the DefaultHandler class.

Table 8. DefaultHandler Error Handler Methods: The table lists the DefaultHandler error handling methods, which you can override.
| Method Name | Description |
| --- | --- |
| error(SAXParseException exception) | Receives notification of a recoverable error. |
| fatalError(SAXParseException exception) | Receives notification of a non-recoverable error. |
| warning(SAXParseException exception) | Receives notification of a warning. |

 
Figure 7. SAX Output: The figure shows part of the output from the SAX parsing application running in JDeveloper.

Listing 4 shows the complete code for the SAXParserApp.java application.

Figure 7 shows a portion of the output of the sample SAX parsing application running in JDeveloper, while Listing 5 shows the complete output.

Again, you can see how the application handles errors by adding an error in the example XML document such as removing a node. Run the SAXParserApp.java application again, and you'll see an error message:

   Fatal Error:: XML-20121:       (Fatal Error) End tag does not match       start tag 'journal'.
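A handler that overrides the three error methods might look like the following sketch. It uses the standard SAXParserFactory, so the message text will differ from the XDK output shown above; the class name SaxErrorSketch and the label prefixes are hypothetical.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.SAXParseException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical class; collects warnings and errors reported during a SAX parse.
class SaxErrorSketch extends DefaultHandler {
    private final List<String> problems = new ArrayList<>();

    @Override
    public void warning(SAXParseException e) {
        problems.add("Warning: " + e.getMessage());
    }

    @Override
    public void error(SAXParseException e) {
        problems.add("Error: " + e.getMessage());
    }

    @Override
    public void fatalError(SAXParseException e) throws SAXException {
        problems.add("Fatal Error: " + e.getMessage());
        throw e; // a fatal error is not recoverable, so stop parsing
    }

    static List<String> check(String xml) {
        SaxErrorSketch handler = new SaxErrorSketch();
        try {
            SAXParserFactory.newInstance().newSAXParser()
                .parse(new InputSource(new StringReader(xml)), handler);
        } catch (SAXException e) {
            // Expected for malformed input; the handler already recorded it.
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
        return handler.problems;
    }

    public static void main(String[] args) {
        // The journal element is never closed, which triggers fatalError().
        for (String problem : check("<catalog><journal></catalog>")) {
            System.out.println(problem);
        }
    }
}
```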

This article demonstrated and documented both DOM and SAX approaches to parsing an XML document. DOM parsing is suitable if you need to modify document nodes, or need random or repeated access to the nodes. SAX parsing is the better choice for large documents and documents where you need only parse events and node values.
