First published by IBM at http://www-106.ibm.com/developerworks/xml/library/x-tippass.html
The evolution of Web service protocols has gone from supporting very simple requests with simple parameters to fully supporting modern, object-oriented languages. XML-RPC, arguably one of the earliest forms of Web services, only supported simple types (strings, integers, booleans, and the like). SOAP took this one step further with its encoding rules for objects. The last step, improving the handling of binary data, came with SOAP with attachments.
SOAP with attachments was originally introduced as an extension to SOAP 1.1, and it is supported by the major SOAP kits. Although SOAP 1.2, the official W3C release, does not support attachments yet, work is under way to include them in the (ideally) near future.
Web services and binary data
I have little doubt that XML's success in
application integration comes from its reliance on a textual encoding (as
opposed to binary protocols such as CORBA, an object-oriented RPC standard, or
RMI, a Java-specific RPC standard). Textual encoding is preferable for several
reasons, the most critical of which may be that it is easier to
debug and easier to write a custom implementation when the need arises.
Still, the reliance on textual encoding has a darker side: XML offers no efficient solution for including binary data. According to the W3C XML Schema specification, binary data should be encoded in base 64 or hexadecimal. Unfortunately, base 64-encoded data is about 33% larger than the original data (every 3 bytes become 4 characters), and hexadecimal encoding doubles the size. This overhead is acceptable for small pieces of binary data, but it is clearly an issue for larger sets.
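The size cost of each encoding is easy to measure. The following is a minimal sketch using only the standard java.util.Base64 class; the class name EncodingOverhead and the 3000-byte payload size are arbitrary choices made for illustration.

```java
import java.util.Base64;

final class EncodingOverhead {
    /** Number of characters needed to encode n bytes in base 64 (with padding). */
    static int base64Length(int n) {
        return Base64.getEncoder().encodeToString(new byte[n]).length();
    }

    /** Number of characters needed to encode n bytes in hexadecimal. */
    static int hexLength(int n) {
        StringBuilder sb = new StringBuilder();
        for (byte b : new byte[n]) {
            sb.append(String.format("%02x", b)); // two hex digits per byte
        }
        return sb.length();
    }

    public static void main(String[] args) {
        int n = 3000; // arbitrary payload size, in bytes
        System.out.println("raw:     " + n);               // 3000
        System.out.println("base 64: " + base64Length(n)); // 4000
        System.out.println("hex:     " + hexLength(n));    // 6000
    }
}
```

Base 64 maps every 3 bytes to 4 characters, and hexadecimal maps every byte to 2 characters, which is where the two ratios come from.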
Binary data is useful in many applications. For example:
While it is possible to create XML versions of these file formats (similar to SVG for vector graphics), binary data has been around for a long time and will likely remain popular.
Finally, there is the issue of XML itself! It is not trivial to include an XML document inside another XML document (the syntactically correct solution relies on CDATA sections and character escaping).
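To see why embedding XML in XML is not trivial, consider this rough sketch of the two techniques mentioned above. The class and method names are hypothetical, and a real application would normally leave this work to its XML library.

```java
final class XmlEmbedding {
    /** Escapes the five predefined XML entities so the text can be used as element content. */
    static String escape(String xml) {
        StringBuilder out = new StringBuilder(xml.length());
        for (int i = 0; i < xml.length(); i++) {
            char c = xml.charAt(i);
            switch (c) {
                case '&':  out.append("&amp;");  break;
                case '<':  out.append("&lt;");   break;
                case '>':  out.append("&gt;");   break;
                case '"':  out.append("&quot;"); break;
                case '\'': out.append("&apos;"); break;
                default:   out.append(c);
            }
        }
        return out.toString();
    }

    /** Wraps the text in a CDATA section, splitting any "]]>" so it cannot close the section early. */
    static String cdata(String xml) {
        return "<![CDATA[" + xml.replace("]]>", "]]]]><![CDATA[>") + "]]>";
    }

    public static void main(String[] args) {
        String inner = "<note lang=\"en\">Tom & Jerry</note>";
        System.out.println("<payload>" + escape(inner) + "</payload>");
        System.out.println("<payload>" + cdata(inner) + "</payload>");
    }
}
```

Note the special case in cdata(): a document that already contains a CDATA section cannot be wrapped naively, because its closing "]]>" would end the outer section too.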
Note: MIME and base 64. To clear up a frequent source of confusion, MIME does not mandate base 64 encoding. Specifically, HTTP implementations do not encode attachments; only mail clients encode attachments, to work around limitations in SMTP (so there is no gain over XML in that case).
To address the needs of all these applications, Web
services must support binary data efficiently. The proposed solution is
SOAP with attachments which, in a nutshell, removes binary
information from the XML payload and stores it directly in the HTTP
request as multipart/related MIME content.
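The layout of such a request can be sketched by assembling the MIME body by hand. This is only an illustration of the structure, not how a SOAP toolkit does it internally: the class name, the boundary string, and the parameter names are assumptions, and a real toolkit generates all of this for you.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;

final class MultipartRelatedSketch {
    // Hypothetical boundary; it must not occur inside any of the parts.
    static final String BOUNDARY = "MIME_boundary";

    /** Value of the HTTP Content-Type header for the whole request. */
    static String contentType() {
        return "multipart/related; type=\"text/xml\"; boundary=\"" + BOUNDARY + "\"";
    }

    /** Builds a two-part body: the SOAP envelope first, then the raw binary attachment. */
    static byte[] build(String soapXml, String contentId, String mediaType, byte[] binary) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        // Root part: the XML payload, without the binary data.
        write(out, "--" + BOUNDARY + "\r\n"
                + "Content-Type: text/xml; charset=UTF-8\r\n\r\n"
                + soapXml + "\r\n");
        // Attachment part: headers as text, content as unencoded bytes.
        write(out, "--" + BOUNDARY + "\r\n"
                + "Content-Type: " + mediaType + "\r\n"
                + "Content-Transfer-Encoding: binary\r\n"
                + "Content-ID: <" + contentId + ">\r\n\r\n");
        out.write(binary, 0, binary.length);
        // Closing boundary.
        write(out, "\r\n--" + BOUNDARY + "--\r\n");
        return out.toByteArray();
    }

    private static void write(ByteArrayOutputStream out, String text) {
        byte[] bytes = text.getBytes(StandardCharsets.UTF_8);
        out.write(bytes, 0, bytes.length);
    }
}
```

Because the attachment bytes are copied as-is, a larger attachment grows the request by exactly its own size, with no encoding overhead.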
Your options, when designing a Web service that works with binary data, are:
Listing 1 is a SOAP request with a base 64-encoded parameter. Note the address element.
Listing 1. base 64-encoded parameter
Implementing attachments
Attachments are available to Java developers through both JAX-RPC (the Java API for
XML-based RPC) and SAAJ (SOAP with Attachments API for Java). Don't
let the SAAJ acronym fool you: JAX-RPC supports attachments (see Resources
for an example). The difference between JAX-RPC and SAAJ is the level of
abstraction, not the capabilities.
JAX-RPC is a high-level API that's more abstract than SAAJ.
It hides most of the protocol-oriented aspects of SOAP
behind an RMI layer. The developer works on Java objects and the
pre-processor turns them into SOAP nodes. JAX-RPC uses the java.awt.Image and javax.activation.DataHandler
classes to represent attachments.
SAAJ is closer to the protocol. It takes more work to create a SOAP message with SAAJ than with JAX-RPC (and it offers no automatic link to WSDL), so in most cases you will want to use JAX-RPC. Still, the low-level aspects of SAAJ make it more suitable for illustrating how attachments really work. Listing 2 is a SOAP request with an attachment. The request asks the server to resize a photo; because photo files are large, an attachment is more efficient.
Listing 2. Attachment parameter
Listing 3 illustrates the creation of the SOAP request. The request asks a server to resize an image. The procedure includes the following steps:
- Create the attachment from a DataHandler object.
- Create the elements for the two parameters (source and percent).
- Link the attachment to its parameter with an href attribute. The attachment is referred to through a cid (content-id) URL.
The service replies with the resized image, again as an attachment. To retrieve it, first test for a SOAP fault (which indicates an error). If there is no fault, retrieve the attachment as a file and process it.
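The relationship between the cid URL in the href attribute and the Content-ID header of the attachment can be sketched as follows. The class and method names are hypothetical, and the sketch deliberately ignores the percent-escaping that a full implementation of cid URLs would need.

```java
final class ContentIdLinks {
    /** Turns a Content-ID header value such as "<photo@example.org>" into "cid:photo@example.org". */
    static String toCidUrl(String contentIdHeader) {
        String id = contentIdHeader.trim();
        if (id.startsWith("<") && id.endsWith(">")) {
            id = id.substring(1, id.length() - 1); // drop the angle brackets
        }
        return "cid:" + id;
    }

    /** Turns "cid:photo@example.org" back into the header value "<photo@example.org>". */
    static String toContentIdHeader(String cidUrl) {
        if (!cidUrl.startsWith("cid:")) {
            throw new IllegalArgumentException("not a cid URL: " + cidUrl);
        }
        return "<" + cidUrl.substring("cid:".length()) + ">";
    }

    public static void main(String[] args) {
        // The identifier is made up for this example.
        System.out.println(toCidUrl("<photo@example.org>"));        // cid:photo@example.org
        System.out.println(toContentIdHeader("cid:photo@example.org")); // <photo@example.org>
    }
}
```

In other words, the receiver resolves an href by stripping the cid: prefix and looking for the MIME part whose Content-ID carries the same identifier.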
Listing 3. Using SAAJ
Listing 3 makes it clear that the attachment lives outside of the XML message. This is necessary for efficiency: the binary data is never encoded as text.
Speaking of efficiency, take a look at Listing 4, which illustrates the more common (and dramatically
shorter) JAX-RPC version of Listing 3. The JAX-RPC precompiler generates
a stub that greatly simplifies coding.
You pass a DataHandler object as a method parameter and
JAX-RPC automatically generates the attachment.
Listing 4. The more efficient JAX-RPC
Conclusion
Choice is good, and SOAP gives you a choice when working with binary data: You
can either encode it as base 64 within the XML payload, which is good for
small datasets, or you can attach larger binary files, unencoded, to the request.
Resources