Existing Solutions vs. Generic, Reusable Marshalling
One of the most common solutions to marshalling large object graphs to XML is to use a low-level library such as StAX (the Streaming API for XML) directly, or even to resort to string concatenation (if you're looking for trouble). The StAX event-based API allows an application to generate individual events for each component of the XML document, and stream these events directly to an output stream such as a file or network socket. While this solution works, it carries a lot of baggage. The XML generation would need to be rewritten for each type of object to be transformed, and the implementation would be very tightly coupled to the XML technology selected. With many of these low-level libraries, the application is responsible for data conversion into and out of the XML type system. Finally, the amount of code required may be very large, thereby creating a debugging and maintenance challenge.
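To make that baggage concrete, here is a minimal sketch of writing a single object with StAX directly. The `OrderItem` class, its fields, and the element names are hypothetical and chosen only for illustration; only the `javax.xml.stream` calls come from the standard API.

```java
import java.io.StringWriter;
import javax.xml.stream.XMLOutputFactory;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamWriter;

// Hypothetical domain type, used only for this illustration.
final class OrderItem {
    final String sku;
    final int quantity;

    OrderItem(String sku, int quantity) {
        this.sku = sku;
        this.quantity = quantity;
    }
}

final class StaxOrderItemWriter {
    // Hand-written mapping: every element and every type conversion is
    // spelled out, and all of it would be repeated for each new object type.
    static void write(XMLStreamWriter out, OrderItem item) throws XMLStreamException {
        out.writeStartElement("item");
        out.writeStartElement("sku");
        out.writeCharacters(item.sku);
        out.writeEndElement();
        out.writeStartElement("quantity");
        out.writeCharacters(Integer.toString(item.quantity)); // manual int-to-text conversion
        out.writeEndElement();
        out.writeEndElement();
    }

    // Convenience helper that buffers the XML in memory for inspection.
    static String toXml(OrderItem item) {
        try {
            StringWriter buffer = new StringWriter();
            XMLStreamWriter out = XMLOutputFactory.newInstance().createXMLStreamWriter(buffer);
            write(out, item);
            out.flush();
            out.close();
            return buffer.toString();
        } catch (XMLStreamException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(toXml(new OrderItem("A-100", 3)));
    }
}
```

Even for a two-field object, the writer is tied to StAX and to one class. An order, a user, and a user action would each need their own hand-written equivalent.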
Another approach is to write a custom OXM marshaller with a standard interface for each object that needs to be transformed into XML. This technique is similar to an implementation I described in a previous DevX article, "Use the Best of StAX and XMLBeans to Stream XML Object Binding." A custom OXM marshaller has the advantage of abstracting the transformation from the application and leveraging the type mapping support of a library such as XMLBeans or JAXB. However, the pattern is still insufficient because it requires a new marshaller implementation for each object type. At the same time, segmenting the XML, as described in my previous article, can cause performance degradation when processing extremely large documents.
The ideal solution is a library that supports marshalling any object to XML, requires little code, is maintainable, and has reasonable performance. The new marshalling pattern that this library would use must satisfy a number of core requirements. It must be:
- Generic: The marshaller must support marshalling any object that the underlying marshaller implementation can understand. For example, with JAXB, this would be any annotated object. For XStream, this would be any JavaBean.
- Reusable: The marshaller must be thread-safe and reusable so it can be used in a stateless system such as an enterprise application or servlet. This allows for easier configuration and testing using a dependency-injection framework such as Spring.
- Scalable: The marshaller must support objects of any size when marshalling and unmarshalling.
- Reasonably high performing: The marshaller should not be much slower than using the OXM library directly. For example, if JAXB is the underlying implementation, the generic, streaming marshaller should perform nearly as well as using JAXB directly to write the object.
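As a rough illustration of the Generic and Reusable requirements, the sketch below defines a single marshalling contract and a stateless implementation that accepts any object. It is not the library described in this article: `GenericMarshaller`, `GetterMarshaller`, `ArchivedUser`, and `MarshalDemo` are hypothetical names, and the reflection-based getter mapping is only a toy stand-in for a real OXM library such as JAXB or XStream.

```java
import java.io.StringWriter;
import java.lang.reflect.Method;
import javax.xml.stream.XMLOutputFactory;
import javax.xml.stream.XMLStreamWriter;

// Hypothetical contract: one marshaller handles any supported object type.
interface GenericMarshaller {
    void marshal(Object bean, XMLStreamWriter out) throws Exception;
}

// Toy stand-in for an OXM library: maps public getters to child elements.
// It keeps no mutable state, so one instance can be shared across threads.
final class GetterMarshaller implements GenericMarshaller {
    @Override
    public void marshal(Object bean, XMLStreamWriter out) throws Exception {
        out.writeStartElement(bean.getClass().getSimpleName());
        for (Method m : bean.getClass().getMethods()) {
            String n = m.getName();
            boolean getter = n.startsWith("get") && n.length() > 3
                    && m.getParameterCount() == 0 && !n.equals("getClass");
            if (!getter) {
                continue;
            }
            // getFoo() becomes a <foo> element; element order is not guaranteed.
            out.writeStartElement(Character.toLowerCase(n.charAt(3)) + n.substring(4));
            out.writeCharacters(String.valueOf(m.invoke(bean)));
            out.writeEndElement();
        }
        out.writeEndElement();
    }
}

// Hypothetical bean used only to exercise the marshaller.
final class ArchivedUser {
    private final String name;
    private final int actionCount;

    ArchivedUser(String name, int actionCount) {
        this.name = name;
        this.actionCount = actionCount;
    }

    public String getName() { return name; }
    public int getActionCount() { return actionCount; }
}

final class MarshalDemo {
    // A single shared instance, as a dependency-injection container might provide.
    static final GenericMarshaller SHARED = new GetterMarshaller();

    static String toXml(Object bean) {
        try {
            StringWriter buffer = new StringWriter();
            XMLStreamWriter out = XMLOutputFactory.newInstance().createXMLStreamWriter(buffer);
            SHARED.marshal(bean, out);
            out.flush();
            out.close();
            return buffer.toString();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(toXml(new ArchivedUser("ana", 7)));
    }
}
```

The point of the sketch is the shape of the contract: the caller never writes type-specific XML code, and the same instance can serve every request. The Scalable and performance requirements are not addressed here.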
Consider a possible scenario where this solution could be applied. Suppose a company has to archive orders in XML. The orders currently exist in a database. Each order may contain hundreds of items. The history of users who created the orders must also be archived: a record of all the actions they performed on the system. This information exists in data access objects (DAOs), which can return the information in pages (e.g., order items 1 to 10, 11 to 20, etc.). Loading the entire order or the entire user history into memory isn't feasible given the hardware specifications, especially if this process needs to be done in parallel. Ideally, a single OXM library would be used to transform both the orders and the user histories to XML in an efficient manner.
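The memory constraint in this scenario can be sketched as a loop that pulls one page at a time from a DAO and streams it to the output before fetching the next. The `PagedDao` and `ItemWriter` interfaces, the `PagedXmlArchiver` class, and the in-memory data are all assumptions made for illustration; a real DAO would query the database.

```java
import java.io.StringWriter;
import java.io.Writer;
import java.util.ArrayList;
import java.util.List;
import javax.xml.stream.XMLOutputFactory;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamWriter;

// Hypothetical paged DAO: returns at most 'limit' rows starting at 'offset'.
interface PagedDao<T> {
    List<T> findPage(int offset, int limit);
}

// Hypothetical callback that writes one item as XML.
interface ItemWriter<T> {
    void write(T item, XMLStreamWriter out) throws XMLStreamException;
}

final class PagedXmlArchiver {
    // Streams every page under one root element; only one page is held in memory.
    static <T> int archive(String rootName, PagedDao<T> dao, int pageSize,
                           ItemWriter<T> writer, Writer sink) throws XMLStreamException {
        XMLStreamWriter out = XMLOutputFactory.newInstance().createXMLStreamWriter(sink);
        out.writeStartDocument();
        out.writeStartElement(rootName);
        int written = 0;
        while (true) {
            List<T> page = dao.findPage(written, pageSize);
            if (page.isEmpty()) {
                break; // no more rows
            }
            for (T item : page) {
                writer.write(item, out);
            }
            written += page.size();
            out.flush(); // push the finished page to the sink before loading the next
        }
        out.writeEndElement();
        out.writeEndDocument();
        out.flush();
        out.close();
        return written;
    }

    // Demo with an in-memory list standing in for the database.
    static String demo(int rowCount, int pageSize) {
        List<String> rows = new ArrayList<>();
        for (int i = 1; i <= rowCount; i++) {
            rows.add("item-" + i);
        }
        PagedDao<String> dao = (offset, limit) -> rows.subList(
                Math.min(offset, rows.size()), Math.min(offset + limit, rows.size()));
        ItemWriter<String> itemWriter = (item, out) -> {
            out.writeStartElement("item");
            out.writeCharacters(item);
            out.writeEndElement();
        };
        try {
            StringWriter sink = new StringWriter();
            archive("order", dao, pageSize, itemWriter, sink);
            return sink.toString();
        } catch (XMLStreamException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(demo(25, 10));
    }
}
```

In this sketch the per-item XML is still hand-written in the callback. The appeal of a generic marshaller is that the same loop could hand each item to an OXM library instead, so orders and user histories would share one code path.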
Using these requirements and this scenario, the following section explains how I designed and implemented a generic, reusable marshalling library.