As a J2EE developer, you might sometimes want to compare a modified XML document with the pre-modified version of that document. Or you might want to compare two XML documents that are both based on the same DTD or XML schema. This article is designed to give you the information you need to answer two questions:
- Is one XML document the same as another?
- What are the differences between two XML documents?
In this article, I’ll show you how to use XDK 10g Production, which provides a mechanism to compare two XML documents and list the differences between them.
Oracle XML Developer’s Kit (XDK) 10g is a set of APIs, tools, and utilities to help you develop XML applications. XDK 10g is available for Java, C, and C++; however, this article discusses only the Java version. XDK 10g supports the DOM 3.0 and XSLT 2.0 specifications. It provides a schema validation API to validate XML documents against an XML schema. You can use the XML Class Generator API in XDK 10g to bind an XML schema to Java classes, and subsequently marshal and unmarshal XML documents. The XML SQL Utility aids in storing XML documents in a relational database. A supplied XSQL servlet combines XML, SQL, and XSLT to generate an XML document from a SQL query.
To get started, first download Oracle’s XDK 10g. You need to add the XMLDiff class package to your classpath. Add both
Using the XMLDiff Class
You use the XMLDiff class in the oracle.xml.differ package to compare two XML documents. The class contains methods to compare two XML documents and to enumerate the differences between them. In addition, you can choose to generate an XSLT stylesheet consisting of the differences between the two XML documents. You can then use the stylesheet to convert one of the compared XML documents to the other. This article compares an example XML document named catalog.xml (see Listing 1) with another XML document called catalog2.xml (see Listing 2).
Download the sample code that accompanies this article, create a directory named XMLDiff, and copy the two XML documents catalog.xml and catalog2.xml to that directory.
When XML documents are compared, white space between elements is also treated as nodes, so two documents that differ only in indentation or line breaks are reported as different. Remove the white space from the XML documents you want to compare unless it is significant for the comparison.
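To see why white space matters, the following minimal sketch counts the child nodes of a root element before and after removing whitespace-only text nodes. It deliberately uses only the JDK's standard JAXP and DOM classes rather than XDK, so it runs without the XDK jars; the class name WhitespaceStripper, its helper methods, and the tiny inline document are made up for illustration.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

// Hypothetical helper (not part of XDK): strips whitespace-only text nodes from a DOM tree.
final class WhitespaceStripper {

    /** Parses an XML string with the JDK's default JAXP parser. */
    static Document parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalArgumentException("Could not parse XML", e);
        }
    }

    /** Recursively removes text nodes that contain nothing but white space. */
    static void stripWhitespace(Node node) {
        Node child = node.getFirstChild();
        while (child != null) {
            Node next = child.getNextSibling(); // remember the sibling before a possible removal
            if (child.getNodeType() == Node.TEXT_NODE
                    && child.getNodeValue().trim().isEmpty()) {
                node.removeChild(child);
            } else {
                stripWhitespace(child);
            }
            child = next;
        }
    }

    public static void main(String[] args) {
        // Made-up sample document: two elements separated by line breaks and indentation.
        Document doc = parse("<catalog>\n  <journal/>\n  <journal/>\n</catalog>");
        Node root = doc.getDocumentElement();
        System.out.println("child nodes before: " + root.getChildNodes().getLength());
        stripWhitespace(root);
        System.out.println("child nodes after:  " + root.getChildNodes().getLength());
    }
}
```

In this sample the root element starts with five child nodes (three whitespace text nodes plus the two elements) and ends with two. A comparison that counts every node would see those extra three as content.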
Comparing XML Documents with the XMLDiff Class
To compare two documents, first import the oracle.xml.differ package, which contains the XMLDiff class, into your Java application.
Next, create a DOMParser to parse the XML documents to be compared. The DOMParser class extends the oracle.xml.parser.v2.XMLParser class.
DOMParser parser=new DOMParser();
You have to load and parse the documents, using one of the parse() methods in the XMLParser class. You can load and parse an XML document from an InputSource, InputStream, Reader, String, or URL. I’ve used an InputStream in this article. Create an InputStream object from the XML document catalog.xml and parse the document with the parse(InputStream) method.
InputStream catalog1=new FileInputStream(new File("C:/XMLDiff/catalog.xml"));
parser.parse(catalog1);
Obtain the XMLDocument object corresponding to the parsed XML document by calling the parser's getDocument() method.
Similarly, create an InputStream for the XML document catalog2.xml and parse it. Next, obtain an XMLDocument object for catalog2.xml, as shown below.
InputStream catalog2=new FileInputStream(new File("C:/XMLDiff/catalog2.xml"));
parser.parse(catalog2);
XMLDocument xmlDocument2=parser.getDocument();
The XMLDiff class provides the interface for comparing two XML documents, so you'll need an XMLDiff object. The downloadable sample Java application, which compares the two sample XML documents, extends the XMLDiff class in a class named XMLCompare. Here’s how you create an XMLCompare object.
XMLCompare xmlDiff=new XMLCompare();
Specify the XML documents to be compared, either as oracle.xml.parser.v2.XMLDocument objects or as java.io.File objects. You can set them as XMLDocument objects with the setDocuments(XMLDocument, XMLDocument) method or, using two calls, with the setInput1(XMLDocument) and setInput2(XMLDocument) methods. Alternatively, you can set them as File objects with the setFiles(File, File) method, or with the setInput1(File) and setInput2(File) methods. The sample code uses the setDocuments(XMLDocument, XMLDocument) method.
You compare the two example XML documents with the diff() method, which returns a boolean. The sample code stores the result in a variable named diff.
If the value of the diff variable is false, the two XML documents are the same; if it is true, the documents are different. Using the example documents shown in Listing 1 and Listing 2, you’ll get a value of true for the diff variable, which indicates that the documents are different.
You can also compare nodes using the equals(Node, Node) method, which also returns a boolean.
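As a point of comparison, the standard DOM Level 3 API offers a similar yes/no check through Node.isEqualNode(). The sketch below uses only JDK classes, not XMLDiff, and is not how XDK implements diff() or equals(); the class name DomEqualityCheck and its differ() method are hypothetical. The result follows the same convention as the diff variable above: true means the documents differ.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

// Hypothetical helper (not part of XDK): a same-or-different check built on DOM Level 3.
final class DomEqualityCheck {

    /** Parses an XML string with the JDK's default JAXP parser. */
    static Document parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalArgumentException("Could not parse XML", e);
        }
    }

    /** Returns true when the root elements (and therefore their subtrees) are not equal. */
    static boolean differ(String xml1, String xml2) {
        return !parse(xml1).getDocumentElement()
                .isEqualNode(parse(xml2).getDocumentElement());
    }

    public static void main(String[] args) {
        // Made-up sample documents.
        String base = "<catalog><journal title='A'/></catalog>";
        System.out.println(differ(base, base));
        System.out.println(differ(base, "<catalog><journal title='B'/></catalog>"));
        System.out.println(differ(base, "<catalog> <journal title='A'/></catalog>"));
    }
}
```

The three calls print false, true, and true. The last pair differs only by a single space inside the root element, which again shows that white space counts as a node.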
Now that you know the documents are different, you might be interested in listing the actual differences. You can generate a listing of the differences using the printDiffTree(int, BufferedWriter) method. The int parameter specifies which XML document to use as the base document when evaluating the differences. In other words, if the int parameter value is 1, the printDiffTree() method outputs the additions/deletions/modifications in XML document 1 as compared to XML document 2, while if int is 2, the method outputs the differences for document 2 as compared to document 1. The BufferedWriter parameter specifies the output file. To obtain the results, specify the int parameter value as 1 and create a BufferedWriter to output the differences between the XML documents, as shown below.
BufferedWriter bufferedWriter=new BufferedWriter(
    new FileWriter(new File("c:/XMLDiff/diff.txt")));
xmlDiff.printDiffTree(1, bufferedWriter);
bufferedWriter.flush();
The BufferedWriter writes the list of differences between the two XML documents to the output file. Listing 3 shows the sample output containing the differences between the two example documents.
The output is fairly straightforward. The MODIFIED keyword indicates XML document elements that are present in both documents, but that differ (are modified) between the two. Added elements are indicated by the keyword ADDED, and deleted elements by DELETED. Note that the labels depend on the base document: if you change the int argument of printDiffTree() from 1 to 2, elements marked DELETED would be marked ADDED, and elements marked ADDED would be marked DELETED.
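The label swap is easiest to see with a toy model. The sketch below is not XMLDiff's algorithm: it reduces each document to a set of element names and uses plain set difference, with hypothetical names throughout (DiffPerspective, added(), deleted(), and the sample element names). It assumes "added" means present in the base document but not in the other one.

```java
import java.util.Arrays;
import java.util.Set;
import java.util.TreeSet;

// Toy illustration only: models each document as a set of element names.
final class DiffPerspective {

    /** Names found in the base document but not in the other one. */
    static Set<String> added(Set<String> base, Set<String> other) {
        Set<String> result = new TreeSet<>(base);
        result.removeAll(other);
        return result;
    }

    /** Names found in the other document but missing from the base. */
    static Set<String> deleted(Set<String> base, Set<String> other) {
        return added(other, base);
    }

    public static void main(String[] args) {
        // Made-up element names standing in for two documents.
        Set<String> doc1 = new TreeSet<>(Arrays.asList("title", "author", "publisher"));
        Set<String> doc2 = new TreeSet<>(Arrays.asList("title", "author", "edition"));

        // Document 1 as the base.
        System.out.println("base 1: ADDED " + added(doc1, doc2) + " DELETED " + deleted(doc1, doc2));
        // Document 2 as the base: the two sets trade places.
        System.out.println("base 2: ADDED " + added(doc2, doc1) + " DELETED " + deleted(doc2, doc1));
    }
}
```

With document 1 as the base, publisher counts as added and edition as deleted; with document 2 as the base, the two trade places.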
Generating an XSLT Stylesheet
You can generate an XSLT stylesheet from the element/attribute differences between the example XML documents. To obtain the stylesheet as an XMLDocument object, use the generateXSLDoc() method. To write it to a file, use the generateXSLFile(java.lang.String filename) method.
The XSLT file generated is illustrated in Listing 4, although I’ve added all the white space for readability.
You can use the XSLT document generated from the differences between the two example documents (see Listing 3) to transform the first document into the second. For example, to apply the diff.xslt file to the XML document catalog.xml, you can use the following command:
>oraxsl catalog.xml diff.xslt
That command generates the XML document catalog2.xml.
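You can also apply a stylesheet from Java code instead of the command line. The following minimal sketch uses the JDK's standard javax.xml.transform API rather than oraxsl. The class name ApplyStylesheet is hypothetical, and the small stylesheet is a hand-written stand-in (an identity transform plus one attribute override), not the stylesheet that XMLDiff generates.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

// Hypothetical helper (not part of XDK): applies an XSLT stylesheet to an XML string.
final class ApplyStylesheet {

    /** Runs the stylesheet against the XML input and returns the result as a string. */
    static String transform(String xml, String xslt) {
        try {
            Transformer transformer = TransformerFactory.newInstance()
                    .newTransformer(new StreamSource(new StringReader(xslt)));
            transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
            StringWriter out = new StringWriter();
            transformer.transform(new StreamSource(new StringReader(xml)), new StreamResult(out));
            return out.toString();
        } catch (TransformerException e) {
            throw new IllegalArgumentException("Transformation failed", e);
        }
    }

    public static void main(String[] args) {
        // Made-up input document.
        String xml = "<catalog><journal date='2004'/></catalog>";
        // Stand-in stylesheet: copy everything, but rewrite journal/@date.
        String xslt =
              "<xsl:stylesheet version='1.0' xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
            + "<xsl:template match='@*|node()'>"
            + "<xsl:copy><xsl:apply-templates select='@*|node()'/></xsl:copy>"
            + "</xsl:template>"
            + "<xsl:template match='journal/@date'>"
            + "<xsl:attribute name='date'>2005</xsl:attribute>"
            + "</xsl:template>"
            + "</xsl:stylesheet>";
        System.out.println(transform(xml, xslt));
    }
}
```

To work with files such as catalog.xml and diff.xslt, you could pass StreamSource objects built from File instances instead of strings. The JDK's built-in processor implements XSLT 1.0, so a stylesheet that depends on later XSLT features would need a different processor.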
Alternatively, you can apply the XSLT stylesheet to some other document that may have additional differences from the second example document. For example, applying the stylesheet diff.xslt would update only the modified elements/attributes (including those added or removed) between catalog.xml and catalog2.xml.
Additionally, you can use the generated XSLT stylesheet to generate an XML document that consists solely of the modified attribute and element values between the two input XML documents. To do that, apply the XSLT to an XML document that does not specify values for any of the attributes and elements as shown below.
To apply the diff.xslt to the preceding XML document with null values, use the command:
>oraxsl catalog-null.xml diff.xslt >catalog-diff.xml
The preceding command generates an XML document containing only modified attributes and elements as shown here:
Listing 5 shows the sample XMLCompare.java program used to compare the two example XML documents.
As you can see, the XDK 10g Production package contains everything you need to compare two XML documents in Java, list the differences between them, make one document match the other, or create a new document containing only the attributes and elements that differ.