
Passing files to a Web service

In this tip, Benoît Marchal discusses the different solutions available for passing binary data (typically files) to a Web service.

First published by IBM at http://www-106.ibm.com/developerworks/xml/library/x-tippass.html

Web service protocols have evolved from supporting very simple requests with simple parameters to fully supporting modern, object-oriented languages. XML-RPC, arguably one of the earliest forms of Web services, only supported simple types (strings, integers, booleans, and the like). SOAP took this one step further with its encoding rules for objects. The latest step, better handling of binary data, came with SOAP with attachments.

SOAP with attachments was originally introduced as an extension to SOAP 1.1, and it is supported by the major SOAP kits. Although SOAP 1.2, the official W3C release, does not support attachments yet, work is under way to include them in the (ideally) near future.

Web services and binary data
I have little doubt that XML's success in application integration comes from its reliance on a textual encoding (as opposed to binary protocols such as CORBA, an object-oriented RPC standard, or RMI, a Java-specific RPC standard). Textual encoding is preferable for several reasons, the most critical of which may be that it is easier to debug and easier to roll your own implementation when the need arises.

Still, the reliance on textual encoding has a darker side: XML offers no efficient solution for including binary data. According to the W3C XML Schema specification, binary data should be encoded in base 64 or hexadecimal. Unfortunately, base 64-encoded data is about 33% larger than non-encoded data (every 3 bytes become 4 characters), and hexadecimal encoding doubles the size. This overhead is acceptable for small pieces of binary data, but it is clearly an issue for larger sets.
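As a rough illustration of that overhead, the following Python sketch encodes the same payload both ways and compares sizes. Python and the helper name `encoding_overhead` are choices made for this example only; the tip itself is language-neutral. Base 64 emits four characters for every three input bytes, while hexadecimal emits two characters per byte.

```python
import base64
import binascii


def encoding_overhead(payload: bytes) -> dict:
    """Return the size in bytes of a payload raw, base 64-encoded, and hex-encoded."""
    return {
        "raw": len(payload),
        "base64": len(base64.b64encode(payload)),
        "hex": len(binascii.hexlify(payload)),
    }


# A 3072-byte stand-in for a small binary file.
sizes = encoding_overhead(bytes(range(256)) * 12)
for name, size in sizes.items():
    print(f"{name:>6}: {size} bytes ({size / sizes['raw']:.2f}x)")
```

The ratios stay the same as the payload grows, which is why the cost only becomes a real concern with large files.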

Binary data is useful in many applications. For example:

While it is possible to create XML versions of these file formats (similar to SVG for vector graphics), binary data has been around for a long time and will likely remain popular.

Finally, there is the issue of XML itself! It is not trivial to include an XML document inside another XML document (the syntactically correct solutions rely on CDATA sections or character escaping).
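To show why nesting XML takes some care, here is a minimal Python sketch of both approaches. The `<payload>` wrapper element and the function names are hypothetical, chosen only for this illustration. Note that a CDATA section cannot contain the sequence `]]>`, so any occurrence in the inner document has to be split across two sections.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape

# An inner document that itself contains an entity and a CDATA section.
inner = '<note lang="en">Tom &amp; Jerry <![CDATA[a < b]]></note>'


def embed_escaped(doc: str) -> str:
    """Embed a document as text by escaping &, < and >."""
    return "<payload>" + escape(doc) + "</payload>"


def embed_cdata(doc: str) -> str:
    """Embed a document in CDATA, splitting any ']]>' it already contains."""
    safe = doc.replace("]]>", "]]]]><![CDATA[>")
    return "<payload><![CDATA[" + safe + "]]></payload>"


# Both wrappers parse as well-formed XML and give back the inner document as text.
recovered = [ET.fromstring(w).text for w in (embed_escaped(inner), embed_cdata(inner))]
print(recovered[0] == inner and recovered[1] == inner)
```

Either way, the receiver gets the inner document as a string and must parse it a second time, which is part of what makes this approach awkward.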

MIME and base 64
To clear up a source of frequent confusion, MIME does not mandate base 64 encoding. Specifically, HTTP implementations do not encode attachments; only mail clients encode attachments, to work around limitations in SMTP (so in e-mail, MIME offers no size advantage over XML).

To address the needs of all these applications, Web services must support binary data efficiently. The proposed solution is SOAP with attachments which, in a nutshell, removes binary information from the XML payload and stores it directly in the HTTP request as multipart/related MIME content.
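The layout of such a request can be sketched in a few lines of Python. This is a hand-built illustration under stated assumptions, not the output of any particular SOAP kit: the `uploadPhoto` and `file` elements, the Content-ID values, the boundary string, and the `build_swa_request` helper are all hypothetical. The envelope points at the attachment through a `cid:` reference, and the attachment bytes travel unencoded in their own MIME part.

```python
def build_swa_request(envelope, attachment, content_id, boundary="MIME_boundary"):
    """Return (headers, body) for an HTTP POST carrying one raw binary part.

    The boundary must not occur inside the envelope or the attachment.
    """
    delimiter = b"--" + boundary.encode("ascii")
    lines = [
        delimiter,                                   # root part: the SOAP envelope
        b"Content-Type: text/xml; charset=UTF-8",
        b"Content-Transfer-Encoding: 8bit",
        b"Content-ID: <envelope>",
        b"",
        envelope.encode("utf-8"),
        delimiter,                                   # second part: the file itself
        b"Content-Type: application/octet-stream",
        b"Content-Transfer-Encoding: binary",
        b"Content-ID: <" + content_id.encode("ascii") + b">",
        b"",
        attachment,                                  # raw bytes, no base 64
        delimiter + b"--",                           # closing delimiter
        b"",
    ]
    body = b"\r\n".join(lines)
    headers = {
        "Content-Type": 'multipart/related; type="text/xml"; '
                        'start="<envelope>"; boundary="%s"' % boundary,
        "Content-Length": str(len(body)),
        "SOAPAction": '""',
    }
    return headers, body


PHOTO = bytes(range(256)) * 4  # stand-in for a binary file
ENVELOPE = (
    '<?xml version="1.0" encoding="UTF-8"?>'
    '<SOAP-ENV:Envelope xmlns:SOAP-ENV="http://schemas.xmlsoap.org/soap/envelope/">'
    "<SOAP-ENV:Body>"
    '<uploadPhoto><file href="cid:photo@example.org"/></uploadPhoto>'
    "</SOAP-ENV:Body></SOAP-ENV:Envelope>"
)
headers, body = build_swa_request(ENVELOPE, PHOTO, "photo@example.org")
print(headers["Content-Type"])
print(len(body) - len(PHOTO), "bytes of envelope and MIME framing")
```

The framing cost is fixed by the envelope and the MIME headers, so it does not grow with the size of the file, unlike base 64 or hexadecimal encoding inside the XML payload.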

Your options, when designing a Web service that works with binary data, are: