Tracking XML Data Changes Easily with SDO

Tracking data changes is an essential requirement in many software, application, and business-integration scenarios. Implementing this requirement rigorously is relatively difficult because modeling and working with the delta for typical changes is generally very involved. On the other hand, reimplementing it in every applicable situation is wasteful, because a single proper model for the delta suits many situations and, in most cases, the requirements are similar. Service Data Objects (SDO), a JSR led by BEA Systems and IBM that defines a generic solution for heterogeneous data access, provides developers with an easy-to-implement mechanism for tracking data history at the system level.

This article shows an example of processing XML data with SDO using version 1.0 of Apache Tuscany, a Java implementation of SDO. Since SDO is not (yet) the standard solution for XML processing, the article also covers basic XML data operations in SDO to provide context.

Three Phases of Processing XML Data
The XML data-processing example in this article assumes the following three phases with a separate party responsible for each phase:

  1. Create
  2. Process
  3. Review

The XML data are transported between these phases (and parties) through a file system. The central scenario of this example is as follows: the second party needs to record changes he/she makes to an XML file created by the first party, and when the third party reviews the XML data, he/she hopes to know of such changes. If you use the Track Changes feature of Microsoft Word, you will recognize the value of these requirements immediately.

Many applications have these requirements, including optimistic concurrency-control implementation, synchronization of offline application data with an active database, and business process management (BPM) systems. The sections to follow demonstrate how SDO helps to implement these requirements easily.

The XML data for the example (based on a sample in the SDO 2.1.0 specification) models a purchase order (see Listing 1 for the schema po_original.xsd). The next section demonstrates how to create a purchase order based on this schema and persist it into a file system with SDO. The example uses the dynamic API. SDO also provides a static API for working with all data sources.

Create and Persist XML with SDO
The creation class (see Listing 2) completes the creation phase of the XML processing in this example. Pay particularly close attention to the seven commented areas of this class:

  1. Define Types and Properties with XSD: Since your data are modeled by an XML schema, you first need to define the SDO Types and Properties in the runtime based on the schema. This is accomplished with XSDHelper through the definePOTypes() method in the Util class:
         public static void definePOTypes() throws Exception {
             FileInputStream fis = new FileInputStream(PO_MODEL_ORIGINAL);
             XSDHelper.INSTANCE.define(fis, null);
             fis.close();
         }

    (Section 9 of the SDO specification governs the actual mapping from XML schema entities to SDO types and properties. Refer to this section for more detail.)

  2. Create the root data object: The dynamic API of SDO represents structured data by either a hierarchy of data objects, each with its own properties, or a DataGraph, which packages a graph of data objects with their metadata. SDO provides the DataFactory interface for creating unconnected data objects, so it is what you need to create the root data object of the purchase order.
  3. Set a data type Property for the root data object: Purchase order is a Type in SDO, and according to the schema, it has a data type property called orderDate, which is a date type. The line under comment 3 sets the orderDate with a date string. Alternatively, you could create a Java Date object and use the setDate() method of DataObject.
  4. Create child data objects: The data object purchaseOrder has several child data objects. As an example, the line under comment 4 creates the shipTo child directly from purchaseOrder with the name “shipTo.” Alternatively, you could use DataFactory to create an unconnected shipTo data object and set it as a child of purchaseOrder with one of the setDataObject() methods of DataObject:
         DataObject shipTo = DataFactory.INSTANCE.create(CONSTANTS.PO_NAMESPACE, "USAddress");
         // ......
         purchaseOrder.setDataObject("shipTo", shipTo);
  5. Set data type Properties for the child data object: Based on the USAddress type definition, the shipTo data object has various data type Properties. The lines below comment 5 set these properties.
  6. Create child data objects for the child data object “items”: This section shows the hierarchical nature of the SDO data model when modeling XML. Items is a child of the root data object purchaseOrder, and it contains several item data objects as its own children.
  7. Persist the XML data to an XML file: With the dynamic SDO API, you can use the XMLHelper interface to persist XML data to an XML file. The code under comment 7 calls the storeXML() method of the Util class, which is defined as follows:
         public static void storeXML(DataObject data, String rootElementName,
                 String xmlFile) throws Exception {
             OutputStream stream = new FileOutputStream(xmlFile);
             // Serialize the data object as the root element of the XML document.
             XMLHelper.INSTANCE.save(data, PO_NAMESPACE, rootElementName, stream);
             stream.close();
         }

The persisted XML file for the purchase order is po_original.xml.

Working with the dynamic SDO API for XML can be so simple that this author believes it is much more convenient than the comparable DOM API.
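For comparison, here is a rough sketch of what building a small fragment of the purchase order looks like with the standard DOM API that ships with the JDK. It is only an illustration: the class name DomPurchaseOrderSketch, the choice to model orderDate as an attribute, the name child of shipTo, and the sample values are all assumptions, and namespaces are left out for brevity.

```java
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

// Hypothetical illustration of the DOM approach; not part of the article's listings.
final class DomPurchaseOrderSketch {

    /** Builds a tiny purchase order with DOM and returns it as an XML string. */
    static String buildPurchaseOrder(String orderDate, String shipToName) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder().newDocument();

            // Every node is created from the document, then attached by hand.
            Element purchaseOrder = doc.createElement("purchaseOrder");
            purchaseOrder.setAttribute("orderDate", orderDate); // assumed to be an attribute
            doc.appendChild(purchaseOrder);

            Element shipTo = doc.createElement("shipTo");
            purchaseOrder.appendChild(shipTo);

            Element name = doc.createElement("name"); // assumed child element
            name.setTextContent(shipToName);
            shipTo.appendChild(name);

            // Serializing requires a separate Transformer step.
            Transformer transformer = TransformerFactory.newInstance().newTransformer();
            transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
            StringWriter out = new StringWriter();
            transformer.transform(new DOMSource(doc), new StreamResult(out));
            return out.toString();
        } catch (Exception e) {
            throw new IllegalStateException("Could not build the purchase order", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(buildPurchaseOrder("1999-10-20", "Alice Smith"));
    }
}
```

Each element needs a create call, an attach call, and untyped string values, which is the kind of bookkeeping the dynamic SDO API hides behind property names.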

The next section discusses how, with SDO’s dynamic API, you can modify the purchase order and track your changes at the same time.

Record Changes in Processing XML Data
You can take advantage of the ChangeSummary mechanism defined in SDO to track modifications and keep a record of changes together with the purchase order. To fulfill the second purpose, you obviously need to modify the XML schema file po_original.xsd. (Listing 3 highlights all the necessary additions.) The imported schema sdo.xsd (specified in SDO 2.1.0) defines ChangeSummaryType, among others. You add an element of this ChangeSummaryType in the PurchaseOrderType. You can name the element anything, but this example uses the name “changes.”

To apply this new schema, you need to load it in the definePOTypes() method of the Util class. You will persist the corresponding XML file generated by the creation class as po.xml. Compared with po_original.xml, po.xml has a new sub-element of the purchase order root element: the changes element. The processing class (see Listing 4) processes the purchase order. When run, the program persists the processed purchase order and produces a record of changes in po_processed.xml (see Listing 5).

Listing 5 shows the changes persisted in the XML file, but the more interesting part is the changes element, which is now much more complex. This element captures all the changes made to the purchase order in the program. (The actual contents of the element are defined by the SDO specification.) Based on the information recorded in this element and the modified data, an application will be able to reconstruct the original data if needed. (Listing 6 captures what the program prints to the console.)

Let’s review the processing class to analyze how SDO allows you to obtain such a detailed record of changes (refer to Listing 4):

  1. The line under comment 1 loads po.xml into the runtime.
  2. The line under comment 2 creates the ChangeSummary object associated with the purchaseOrder data object.
  3. To track changes, the line under comment 3 turns on the logging for the ChangeSummary object.
  4. From there until the line under comment 4, where the logging is turned off, all changes to purchaseOrder and its child data objects are captured in the ChangeSummary object, chngSum.
  5. The printout is produced by the line under comment 5, which calls the printChangeSummary() method (see Listing 7).
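The logging window in steps 3 and 4 can be pictured with a small stand-in written against plain Java collections. This is not the SDO API or the Tuscany implementation: the class LoggedRecordSketch, its methods, and the property names are hypothetical and only borrow the beginLogging/endLogging naming of ChangeSummary. The sketch shows that only changes made while logging is on are recorded, and that the first old value of each property is the one kept.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical stand-in for a data object with a change log; NOT the SDO API.
final class LoggedRecordSketch {
    private final Map<String, Object> values = new LinkedHashMap<>();
    private final Map<String, Object> oldValues = new LinkedHashMap<>();
    private boolean logging;

    /** Starts a fresh logging window. */
    void beginLogging() {
        oldValues.clear();
        logging = true;
    }

    /** Stops recording; the log collected so far is kept. */
    void endLogging() {
        logging = false;
    }

    void set(String property, Object value) {
        // Record the original value only the first time a property changes while logging.
        if (logging && !oldValues.containsKey(property)) {
            oldValues.put(property, values.get(property));
        }
        values.put(property, value);
    }

    Object get(String property) {
        return values.get(property);
    }

    /** Returns a copy of the recorded old values, keyed by property name. */
    Map<String, Object> getOldValues() {
        return new LinkedHashMap<>(oldValues);
    }

    static String demo() {
        LoggedRecordSketch po = new LoggedRecordSketch();
        po.set("comment", "Hurry");          // before logging: not recorded
        po.beginLogging();
        po.set("comment", "No rush");        // recorded, old value is "Hurry"
        po.set("comment", "Deliver Monday"); // the first old value is kept
        po.endLogging();
        po.set("orderDate", "1999-10-20");   // after logging: not recorded
        return po.getOldValues().toString();
    }

    public static void main(String[] args) {
        System.out.println(demo()); // prints {comment=Hurry}
    }
}
```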

The printChangeSummary() method first uses getChangedDataObjects() to retrieve all the changed data objects, and then handles each one according to its category: created, deleted, or modified. If the data object is newly created, you print the information about this object and all its properties (by calling an annotated version of printDataObject(), an example provided in the specification). You also show the data object containing it, if one exists.

If the data object is deleted, you first identify its old containing object (if it exists), and then try to print it with printDataObject(). In this case, the printout indicates that the Apache Tuscany implementation does not generate anything. However, by calling the getOldValues() method of ChangeSummary with the deleted data object, you can examine all its properties and their values. Such information about a property of a changed data object is exposed through the nested interface ChangeSummary.Setting, which captures whether or not the property was set before and, if set, what the old value was (if available).

If the data object is modified, you can call printDataObject() to examine all its properties and their current values. Using getOldValues() of ChangeSummary, you can obtain the ChangeSummary.Setting objects corresponding to the changed properties. You carry out this examination in the private method printUpdatedProperty() (see Listing 8).

The main part of the printUpdatedProperty() method first decides whether the property is a data type or a data object. If it is a data object, the method further checks whether the property is multiple-valued (such as an item) or single-valued (such as billTo). If it is multiple-valued, the method checks if the value is created, modified, or untouched and ultimately calls printDataObject. The method calls printDataObject() directly for single-valued properties of DataObject type. If the property is a data type, the method printUpdatedProperty() calls Setting.getValue() to retrieve the old value, and uses the method get(Property prop) of DataObject to obtain its current value.

The private method printDeletedProperty() follows roughly the same logic. The difference is that no current value exists for any property of the deleted data object.
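The three categories and the old-value lookup described above can be illustrated without the SDO runtime. SDO derives them from its change log; the sketch below swaps in a simpler technique and diffs two snapshots instead, so it is not how ChangeSummary works internally. The class ChangeCategorySketch, the item identifiers, and the productName and quantity properties are all assumptions made for the example.

```java
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical snapshot-diff illustration; NOT the SDO ChangeSummary implementation.
final class ChangeCategorySketch {
    enum Kind { CREATED, DELETED, MODIFIED }

    /** Classifies every object id that differs between the two snapshots. */
    static Map<String, Kind> categorize(Map<String, Map<String, String>> before,
                                        Map<String, Map<String, String>> after) {
        Map<String, Kind> kinds = new TreeMap<>();
        for (String id : after.keySet()) {
            if (!before.containsKey(id)) {
                kinds.put(id, Kind.CREATED);
            } else if (!before.get(id).equals(after.get(id))) {
                kinds.put(id, Kind.MODIFIED);
            }
        }
        for (String id : before.keySet()) {
            if (!after.containsKey(id)) {
                kinds.put(id, Kind.DELETED);
            }
        }
        return kinds;
    }

    /** Old values of the properties that changed; for a deleted object, all of them. */
    static Map<String, String> oldValues(Map<String, Map<String, String>> before,
                                         Map<String, Map<String, String>> after,
                                         String id) {
        Map<String, String> old = new TreeMap<>();
        Map<String, String> was =
                before.getOrDefault(id, Collections.<String, String>emptyMap());
        Map<String, String> now = after.get(id); // null when the object was deleted
        for (Map.Entry<String, String> e : was.entrySet()) {
            if (now == null || !e.getValue().equals(now.get(e.getKey()))) {
                old.put(e.getKey(), e.getValue());
            }
        }
        return old;
    }

    private static Map<String, String> item(String productName, String quantity) {
        Map<String, String> m = new LinkedHashMap<>();
        m.put("productName", productName);
        m.put("quantity", quantity);
        return m;
    }

    static String demo() {
        Map<String, Map<String, String>> before = new LinkedHashMap<>();
        before.put("item1", item("Lawnmower", "1"));
        before.put("item2", item("Baby Monitor", "2"));

        Map<String, Map<String, String>> after = new LinkedHashMap<>();
        after.put("item1", item("Lawnmower", "3"));   // quantity changed
        after.put("item3", item("Garden Hose", "1")); // created; item2 was deleted

        return categorize(before, after)
                + " " + oldValues(before, after, "item1")
                + " " + oldValues(before, after, "item2");
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

As in the printout discussed here, a deleted object has old values for every property but no current values, while a modified object reports old values only for the properties that changed.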

In the system console printout, the items data object is considered updated even though the program does not modify it directly. This results from the modifications made to its child item data objects.

The next section reviews the modified purchase order.

Review Changes Made by a Different Entity
This third phase of XML data processing should complete two tasks:

  1. The program prints out the major information in the modified order.
  2. It displays all the modifications made in the processing phase.

While you can further modify the purchase order as in the previous phase, this section shows how to revoke all the modifications, using the review class (see Listing 9). The lines below comment 2 (Shows the information in the modified purchase order) complete the first task above, and they should be straightforward to you by now. The call to printChangeSummary() under comment 3 (Display changes made in the second phase) once again prints to the system console the changes made in the second phase.

Note that this second task is different from printing ChangeSummary in the previous section, where the printout is directly based on ChangeSummary as a Java object. Here, the change summary is reloaded into the runtime from an XML file. You can identify this difference by comparing the two printouts. In the previous one, printDataObject() generates nothing for the deleted item, while calls to printDeletedProperty() list all its properties and their values. In the current printout, the opposite is true. This discrepancy is probably a bug in the Apache Tuscany implementation (notice a second potential bug described in the comments in Listing 2). I hope this provides some evidence of how difficult it is to implement a rigorous change tracking mechanism.

In the rest of the program, the block under comment 4 (Undo all changes in the second phase) calls the undoChanges() method of ChangeSummary, and the block under comment 5 (Modify the purchase order on the original version) makes a simple change to this reverted purchase order. Look at the persisted XML (po_reviewed.xml) to verify the effects of these operations.
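To make the idea behind undoing changes concrete, here is a minimal stand-in that restores recorded old values, including the distinction between "had an old value" and "was not set before" that ChangeSummary.Setting captures. It is a sketch under stated assumptions, not the SDO API or Tuscany's behavior: the class UndoSketch, its methods, and the property names are hypothetical, and values are assumed to be non-null strings.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Optional;

// Hypothetical stand-in that mimics the idea of undoing logged changes; NOT the SDO API.
final class UndoSketch {
    private final Map<String, String> values = new LinkedHashMap<>();
    // Old state per changed property; Optional.empty() means "was not set before".
    private final Map<String, Optional<String>> oldSettings = new LinkedHashMap<>();
    private boolean logging;

    void beginLogging() {
        oldSettings.clear();
        logging = true;
    }

    void endLogging() {
        logging = false;
    }

    void set(String property, String value) {
        remember(property);
        values.put(property, value);
    }

    void unset(String property) {
        remember(property);
        values.remove(property);
    }

    // Keeps the state a property had when it was first touched during logging.
    private void remember(String property) {
        if (logging && !oldSettings.containsKey(property)) {
            oldSettings.put(property, Optional.ofNullable(values.get(property)));
        }
    }

    /** Restores every logged property to its recorded state and clears the log. */
    void undoChanges() {
        for (Map.Entry<String, Optional<String>> e : oldSettings.entrySet()) {
            if (e.getValue().isPresent()) {
                values.put(e.getKey(), e.getValue().get());
            } else {
                values.remove(e.getKey()); // property did not exist before logging
            }
        }
        oldSettings.clear();
    }

    Map<String, String> snapshot() {
        return new LinkedHashMap<>(values);
    }

    static String demo() {
        UndoSketch po = new UndoSketch();
        po.set("comment", "Hurry");
        po.beginLogging();
        po.set("comment", "No rush"); // modified
        po.set("priority", "high");   // newly set
        po.endLogging();
        po.undoChanges();
        return po.snapshot().toString();
    }

    public static void main(String[] args) {
        System.out.println(demo()); // prints {comment=Hurry}
    }
}
```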

SDO Rising
You’ll find successful SDO implementations not only in leading open source and commercial software such as Eclipse, BEA AquaLogic Data Services Platform (ALDSP), and IBM WebSphere Process Server, but also in various languages including Java, C++, and PHP.

In 2007, two major developments occurred in the SDO ecosystem:

  1. In March, Open SOA announced it would submit SDO to OASIS.
  2. In April, the JCP formally declared it would consider SDO for inclusion in future versions of Java EE.

Together with the recent release of the SDO Java implementation from Apache Tuscany, these ought to be driving forces for its further adoption.

This article showcased only one feature of SDO, which offers many others that are just as powerful and useful. Explore this technology and you will discover other ways it simplifies the implementation of some common requirements for data management in the service domain.

Acknowledgement: The author would like to thank Laxma Reddy for reviewing a draft of this article and the example source code, and providing his valuable comments.

