Understanding Transport Models
SOAP is not a completely transparent solution, and Web service developers must look under the covers of SOAP, at the lower-level transport mechanisms and models to inspect how they are implemented. In the simple case, the platform will automatically generate the code and SOAP messaging constructs for you, making it easy to develop Web services. Generally, this works when the developer doesn't have to do any customization of the service. When customization is required, often the developer works directly with the SOAP messages and possibly the XML content underneath. Thus, developers need to understand the SOAP and XML layers.
When designing your Web services interface, remember that the use of XML by itself does not guarantee interoperability. XML is not the
"silver bullet" for solving integration problems. There is still a need for businesses to communicate and agree on the vocabulary that will be used in Web service interactions. Just introducing XML into your architecture won't guarantee this. XML is simply a language or syntax for describing data. XML by itself does not provide the semantics for understanding the data. Spoken languages share many of same characters. However, an English speaker would not be able to understand text written in German. They share an alphabet (i.e., the syntax), but how the alphabet is interpreted (semantics) is different.
Even if you come to an agreement on meaning, it does not necessarily follow that you will share the same syntax, or language, with your partners. For example, you might decide that <order> represents a customer order while your partner decides to use <purchase_order>. In the real world, you usually cannot force a particular XML schema on your partners. So, what can you do to minimize this coupling? You can consider adopting a common internal format that will shield you from any external dependencies. In the case of the customer order, you actually might have more than one partner, each having its own XML representation. You probably cannot force a single XML format, so it's better to build your own internal format that can translate or transform to the partner formats. A technology such as XSLT could be used to perform this transformation. Some Web service platforms provide mechanisms, either at the gateway, Web server, or application server, to set up XML mapping rules between XML documents that are received and the internal representations used.
DOM versus SAX
Many Web services runtime environments shield the developer from having to worry about lower-level XML parsing and processing. However, in situations where additional flexibility is required or where performance is critical, it might be important for the developer to write custom code or handlers that involve the processing of XML documents. In these instances, choosing between the two XML parsing models, DOM and SAX, will be a critical design decision (see Figure 2).
|Figure 2: DOM versus SAX. The graphic shows key differences between the two XML-parsing document models.|
DOM uses a tree-like approach for accessing an XML document while SAX uses more of an event model. With a DOM parser, the XML document is converted into a tree containing the XML content. Subsequently, developers can traverse the DOM tree. There are a number of benefits to this approach. Generally, it is easier to program. A developer simply has to make one call to create the tree, and the navigation APIs are fairly easy to use. It's also fairly easy to add and modify elements in the tree. A common use for DOM is when the XML document is being changed frequently by the service.
However, there are a number of disadvantages to using DOM. For one, when you parse an XML document using DOM, you have to parse the entire document. So, there is an up-front performance and memory hit in this approach, especially for very large XML documents.
SAX uses an event-based model for parsing XML. It works through a set of events that get fired as the XML is parsed. When a given tag is found, a callback method is invoked to indicate that this tag was found. Generally, the SAX approach reduces the overall memory overhead because it is left up to the developer to determine which events they want to handle. SAX can typically increase scalability if you are only processing subsets of the data. But, SAX is more difficult to program than DOM, and the SAX approach doesn't work well if you are required to make multiple passes over the same document.
Generally, DOM is appropriate when you are dealing with more document-oriented XML files. SAX fits better when what you locate maps directly to other objects in your system (e.g., mapping from an XML structure to a Java class) or when you want to extract specific tags from your document.