Solving the same problems over and over again can be quite tiring for a software engineer, yet the object persistence wheel has been reinvented more times than I'd like to count. Thankfully, the industry is converging on XML because it can represent object relationships very well, and it is architecture- and language-agnostic by nature. XML is now the backbone of most client/server applications, from XHTML to SOAP web services to RESTful services, preferences, persistence, and configuration.
However, even with the advent of XML, mapping from objects such as Java class instances to XML is not always trivial. In particular, using a "contract first" approach that defines an XML schema, namespaces, and XML data types can be arduous. Object-to-XML mapping (OXM) libraries such as JAXB, XMLBeans, and XStream have made OXM easier, and they've helped to define APIs that serve as the foundation of serialization for tools such as Spring Web Services (Spring-WS). These libraries work by generating classes or using Java annotations to map the objects to a defined XML schema automatically. So the application developer simply instantiates an object, populates the data, and tells the library to marshal the object to well-formed XML. The process then works in reverse when the application unmarshals the XML back into objects by feeding the XML into the library.
A common architectural pattern for applications using OXM involves defining a generic marshaller and unmarshaller interface. This approach hides the actual marshalling technology behind the interface to enable multiple implementations, easier mock testing, consistent exception handling, and future flexibility. However, these interfaces, combined with the hierarchical nature of XML, require that the entire object to be marshalled live in memory, which becomes a major hindrance when the objects grow too large. For example, how do you marshal a web server access log that may contain tens of thousands of requests, or how do you marshal a security audit log for multiple users on the system simultaneously without exhausting memory?
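To make the pattern concrete, here is a minimal sketch of such an interface in Java. The names `Marshaller`, `AccessLog`, `SimpleAccessLogMarshaller`, and `MarshallerDemo` are hypothetical and exist only for illustration, and the hand-written implementation stands in for one backed by JAXB, XMLBeans, or XStream. Note that the caller must hand over a fully built `AccessLog`, which is exactly the memory constraint described above.

```java
import java.io.IOException;
import java.io.StringWriter;
import java.io.UncheckedIOException;
import java.io.Writer;
import java.util.List;

// Hypothetical generic interface: callers never see the OXM library behind it.
interface Marshaller<T> {
    void marshal(T source, Writer out) throws IOException;
}

// Hypothetical domain object. Every request must already be in the list
// before marshalling can begin.
final class AccessLog {
    final List<String> requests;

    AccessLog(List<String> requests) {
        this.requests = requests;
    }
}

// Hand-rolled stand-in for a library-backed implementation.
final class SimpleAccessLogMarshaller implements Marshaller<AccessLog> {
    @Override
    public void marshal(AccessLog log, Writer out) throws IOException {
        out.write("<accessLog>");
        for (String request : log.requests) {
            out.write("<request>" + escape(request) + "</request>");
        }
        out.write("</accessLog>");
    }

    // Escapes the characters that are unsafe in XML element content.
    private static String escape(String s) {
        return s.replace("&", "&amp;").replace("<", "&lt;").replace(">", "&gt;");
    }
}

// Convenience helper that marshals to a String through the interface only.
final class MarshallerDemo {
    static <T> String toXml(Marshaller<T> marshaller, T source) {
        StringWriter out = new StringWriter();
        try {
            marshaller.marshal(source, out);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out.toString();
    }
}
```

Because `MarshallerDemo.toXml` depends only on the interface, a mock or a different OXM library can be swapped in without touching the calling code.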
A better solution (albeit with a few limitations) is the callback approach, which solves many of the problems found with existing solutions for large object XML serialization. Using a callback-based API allows an application to "stream" objects in and out of XML while still gaining all the benefits of OXM, such as namespace handling, element-to-field mappings, and data type conversions. This article explains how the callback approach addresses the shortcomings of the existing alternatives, namely direct low-level XML handling and custom OXM marshallers.
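As a rough illustration of the idea, and not necessarily the API this article goes on to describe, the sketch below lets the marshaller own the XML structure while the application supplies items one at a time through a callback. The names `ItemSink`, `StreamingCallback`, and `StreamingLogMarshaller` are assumptions made for this example. It uses the standard StAX `XMLStreamWriter`, and for brevity it omits the namespace handling and data type conversion that a real OXM library would provide.

```java
import java.io.StringWriter;
import java.io.Writer;
import javax.xml.stream.XMLOutputFactory;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamWriter;

// Hypothetical sink: accepts one item and writes it out immediately.
interface ItemSink<T> {
    void accept(T item) throws XMLStreamException;
}

// Hypothetical callback: the application pushes items into the sink
// one by one instead of building the whole collection first.
interface StreamingCallback<T> {
    void writeItems(ItemSink<T> sink) throws XMLStreamException;
}

final class StreamingLogMarshaller {
    // Writes the wrapper element, then hands control to the callback.
    static void marshal(StreamingCallback<String> callback, Writer out)
            throws XMLStreamException {
        XMLStreamWriter xml = XMLOutputFactory.newInstance().createXMLStreamWriter(out);
        xml.writeStartElement("accessLog");
        callback.writeItems(request -> {
            xml.writeStartElement("request");
            xml.writeCharacters(request); // escapes &, <, and > as needed
            xml.writeEndElement();
        });
        xml.writeEndElement();
        xml.flush();
        xml.close();
    }

    // Demo: generates requests on the fly, holding only one in memory at a time.
    static String demo(int count) {
        StringWriter out = new StringWriter();
        try {
            marshal(sink -> {
                for (int i = 0; i < count; i++) {
                    sink.accept("GET /page/" + i);
                }
            }, out);
        } catch (XMLStreamException e) {
            throw new IllegalStateException(e);
        }
        return out.toString();
    }
}
```

In a real application the callback would read from a file, a database cursor, or a queue rather than a loop counter, and the `Writer` would wrap a file or network stream so that neither the objects nor the XML ever need to fit in memory at once.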