Without a doubt, Web services have hit the mainstream and have demonstrated their value. They were designed largely to address interoperability and distributed computing, and both goals were realized by layering technologies over existing implementation environments. Web services have not only changed the way that applications are built, but more importantly, they have enabled entirely new classes of solutions. For the first time we have foundations that enable the construction of applications that allow participation from individuals and computing systems in different organizations, operating within a diverse set of IT environments.
As game-changing as they are, though, Web services aren’t always enough. Sure, if you are offering a stock ticker service then you likely needn’t address anything beyond platform independence and remote invocation. However, if you are building Enterprise Content Management (ECM) services and applications, you must leverage the paradigms and tools that are coarsely categorized as service oriented. ECM systems capture, store, secure, and manage the documents and other content assets of enterprises so that information can be found, retrieved, delivered, and retained when needed.
Regardless of industry, all organizations and corporations, from small to very large, have content management needs that are fulfilled with some type of automated solution. The effectiveness of such solutions has been repeatedly proven over the last 20 years. In the early part of this history a content-centric solution was generally built over a single content repository, and this resulted in ECM architectures that tightly coupled the storage, management, and delivery functionalities. Content management needs have since changed, and today users require solutions that span multiple content repositories. For example, an insurance claims processing application will draw together accident reports from one system and customary repair costs from another, and allow a claims adjuster to produce documentation that is stored in a third. Coordinating a regulated process across this heterogeneous environment requires a richer development and deployment infrastructure than was previously available. A service-oriented approach fits the bill.
While there are many elements of service orientation that could be applied to this complex solution space, only a handful stand out as the foundation of service-oriented architectures (SOA) for ECM. To build an ECM system interface you can’t just start writing Web services; instead, you have to be aware of interoperability constraints, take advantage of the WS-* and other Web services-related specifications, and create service interfaces that can be utilized from a number of different application development environments, including graphical tools targeted at a new class of developer. This is far more than a simple technological change. In this article, I’ll discuss these broader shifts and explain how a careful and deliberate approach to planning an ECM services implementation will result in a more efficient and flexible system that delivers ROI.
SOA Adds Agility
Enterprise Content Management systems have been indisputably successful over the last 15 to 20 years, demonstrating significant ROI and enabling solutions that were previously impossible to achieve. While steadfastly dependable, in many cases the systems have become rather large and difficult to upgrade; the average time between major releases across the largest ECM vendors can be measured in years rather than months, and customer upgrades to new releases require lengthy planning cycles. SOA affords an opportunity to change this.
Web services that are decoupled both from other services and from the content management system itself will modularize ECM capabilities, allowing fundamental units to be upgraded independently. For example, imagine that a Web service that generates a PDF rendition from a document checked into the repository has recently been optimized. Using SOA, this Web service can be tested and deployed without changing services such as check-out and check-in. This avoids the expensive and time-consuming, yet often mandatory, recertification of these core ECM library services.
This agility, however, does not come for free by simply offering Web service-based interfaces. Rather, the design of the Web services must deliberately focus on meeting this objective. The mechanics of recording audit history upon document check-in are critical in regulated environments; therefore, in the example above, audit history recording must be sufficiently isolated from the activities of a new rendition-generating service. The level of tolerance for interference is zero.
Planning for agility does not apply only to the construction of the Web services themselves; it also involves anticipating the incremental evolution of the services infrastructure, that is, where you’re taking your ECM environment in the future. The environment in which Web services operate can be as simple as an application server that services HTTP requests, or it may include a vast array of other components such as single sign-on, message queues, UDDI directories, and an enterprise service bus (ESB).
Content management Web services will be deployed within ecosystems that range over the entire spectrum from basic to sophisticated and will be expected to continue functioning as the services infrastructure is incrementally upgraded. When initially deployed, for example, ECM services may be invoked with a username and password for authentication; an enterprise SOA upgrade, however, may later allow connection to the same ECM services with a single sign-on token. Again, the ECM services must be designed to anticipate such environmental changes.
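One way to picture this kind of anticipation is a service that never inspects credentials itself but delegates to whatever authenticators the surrounding infrastructure registers. The following is a minimal Python sketch under stated assumptions: the names `UsernamePassword`, `SsoToken`, `EcmService`, and `register_authenticator` are hypothetical and do not come from any ECM product or Web services runtime.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Optional


@dataclass(frozen=True)
class UsernamePassword:
    """Hypothetical credential used in the initial deployment."""
    username: str
    password: str


@dataclass(frozen=True)
class SsoToken:
    """Hypothetical credential introduced by a later SOA upgrade."""
    value: str


# An authenticator maps a credential to a user name, or None if rejected.
Authenticator = Callable[[object], Optional[str]]


class EcmService:
    """Sketch of a service whose operations are independent of how
    the caller was authenticated."""

    def __init__(self) -> None:
        self._authenticators: Dict[type, Authenticator] = {}

    def register_authenticator(self, credential_type: type,
                               authenticator: Authenticator) -> None:
        # New credential types can be added without touching check_in.
        self._authenticators[credential_type] = authenticator

    def check_in(self, credential: object, document_name: str) -> str:
        authenticator = self._authenticators.get(type(credential))
        if authenticator is None:
            raise ValueError("unsupported credential type")
        user = authenticator(credential)
        if user is None:
            raise PermissionError("authentication failed")
        return f"{user} checked in {document_name}"
```

In this sketch, supporting single sign-on later amounts to one additional `register_authenticator` call; the check-in logic itself is not modified and so would not need to be retested for that reason alone.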
Core WS Standards for ECM Should Be Mandatory
Getting content, say from a user’s desktop, into the repository, and ensuring that it is not tampered with in transit and is secured once stored, are two of the basic concerns for an ECM system interface. No one wants their employee performance appraisal accessible by any of their colleagues; ECM systems provide access controls that require users to identify themselves, often with a username and password, before access is granted to managed content. Fortunately, there are Web services standards that offer solutions.
Security, both at the point of access and while content is in transit, is addressed through the WS-Security standard. WS-Security extends SOAP to address message integrity and confidentiality and also defines a mechanism for the inclusion of security tokens within the SOAP envelope.
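To make the idea of a security token inside the SOAP envelope concrete, here is a small Python sketch that builds a SOAP 1.1 envelope whose header carries a WS-Security UsernameToken. It is illustrative only: in practice a Web services runtime would generate this header, the function name `build_envelope` is an assumption, and the sketch adds no signing or encryption.

```python
import xml.etree.ElementTree as ET

# Namespace URIs for SOAP 1.1 and the WS-Security 1.0 extension schema.
SOAP_NS = "http://schemas.xmlsoap.org/soap/envelope/"
WSSE_NS = ("http://docs.oasis-open.org/wss/2004/01/"
           "oasis-200401-wss-wssecurity-secext-1.0.xsd")


def build_envelope(username: str, password: str) -> ET.Element:
    """Return a SOAP envelope with a UsernameToken in its header.

    Hypothetical helper for illustration; the body is left empty.
    """
    ET.register_namespace("soap", SOAP_NS)
    ET.register_namespace("wsse", WSSE_NS)

    envelope = ET.Element(f"{{{SOAP_NS}}}Envelope")
    header = ET.SubElement(envelope, f"{{{SOAP_NS}}}Header")

    # The Security header block is where tokens are carried.
    security = ET.SubElement(header, f"{{{WSSE_NS}}}Security")
    token = ET.SubElement(security, f"{{{WSSE_NS}}}UsernameToken")
    ET.SubElement(token, f"{{{WSSE_NS}}}Username").text = username
    ET.SubElement(token, f"{{{WSSE_NS}}}Password").text = password

    ET.SubElement(envelope, f"{{{SOAP_NS}}}Body")
    return envelope


def to_xml(envelope: ET.Element) -> str:
    """Serialize the envelope to a string for inspection."""
    return ET.tostring(envelope, encoding="unicode")
```

A password sent this way is only as safe as the transport beneath it, which is one reason the token mechanism is normally combined with the integrity and confidentiality features of the standard.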
WS-Security does not provide a security solution; rather, it defines a means by which Web services and security solutions are to cooperate in the SOA. To ensure that security is soundly addressed, it must be an infrastructure component with which all business-critical applications, including an SOA-ready ECM system, interact. With such infrastructure elements available, the Web services developer needn’t reinvent mechanisms for message integrity and access control. Instead, the developer should design services that utilize WS-Security and reuse the capabilities implemented in Web service runtimes such as Axis and Systinet.
For moving the content itself, two mechanisms are in common use: Base64 encoding and MTOM. Base64 takes each group of three bytes from an unstructured binary stream and maps it to four characters suitable for inclusion in an XML document. While this approach clearly suffers from inefficiency (the encoded payload is roughly a third larger than the original), its simplicity and ubiquity make it suitable in situations where the encoded payloads are small and infrequently passed (e.g. human-based processes such as expense report submissions). MTOM, which was created to optimize the transmission of SOAP messages, addresses efficient transfer of binary content by serializing it as MIME multipart content and is the transfer mechanism of choice for batch uploads (e.g. automated document scanning) or large binary files (e.g. CAD drawings). Your choice of content transfer mechanism should predominantly be driven by the application scenario, how frequently content is uploaded or downloaded, and the relative sizes of the payloads.
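The cost of Base64 is easy to quantify. The short Python sketch below computes the encoded size of a payload and checks it against the standard library encoder; the helper name `base64_encoded_size` and the sample payload are illustrative assumptions.

```python
import base64


def base64_encoded_size(raw_size: int) -> int:
    """Number of characters needed to Base64-encode raw_size bytes.

    Every group of up to three input bytes becomes four output
    characters (the last group is padded with '=').
    """
    return 4 * ((raw_size + 2) // 3)


def overhead_ratio(raw_size: int) -> float:
    """Encoded size divided by raw size for a non-empty payload."""
    return base64_encoded_size(raw_size) / raw_size


# A 3,072-byte sample payload, standing in for a small document.
payload = bytes(range(256)) * 12
encoded = base64.b64encode(payload)

# The formula agrees with the real encoder.
assert len(encoded) == base64_encoded_size(len(payload))
```

For payloads of any meaningful size the ratio settles at about 1.33, which is tolerable for an occasional expense report but adds up quickly for batch scanning or large drawings.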
Finally, to ensure platform independence for the ECM system, the Web Services Interoperability Organization’s Basic Profile (WS-I BP) specification, which tightly constrains the use of SOAP in support of interoperability, should be supported. WS-I BP also places more stringent requirements on the use of WSDL, eliminating certain ambiguities that could present interoperability challenges. Providing ECM Web services that meet the requirements of WS-I BP 1.1 will help ensure that they can be deployed on a variety of platforms and be consumed by applications running on similarly varied platforms.
The Application Developer for ECM in the SOA
The developer of the content-rich application does not fit a single profile. While historically most have been programmers (Java or .NET) who are at least one level removed from the business user whose concerns they are addressing, recent innovations are making application development directly available to the business analyst.
Consider first the business analyst. This “developer” will use technologies such as business process management (BPM) tools or portal systems to build and configure applications. These tools provide intuitive, often graphical, drag-and-drop, non-coding interfaces to facilitate the composition of services into new targeted applications. The provider of ECM services must carefully consider two main things to ensure that their offerings can be utilized by this new breed of developer. First, the services offered must expose a level of detail and control that is consistent with the needs of the business analyst, and this means a coarser granularity than what is typically offered to the serious programmer. Instead of offering numerous interfaces for tasks such as creating an object in the repository, identifying an object type, assigning contextual labels, and making the content available to other ECM users, a single, simple interface for uploading content into the repository is called for. The second concern for the service producer is that the BPM and portal-based tools require services to be described in a standardized form. Fortunately, the Web Services Description Language (WSDL) is already in widespread use and is likely something the Web services developer is already familiar with.
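The difference in granularity can be sketched in a few lines of Python. The in-memory repository below is a hypothetical stand-in for the fine-grained operations named above, and `upload_content` is an assumed name for the single coarse-grained operation a business analyst would compose; none of these names come from a real ECM API.

```python
from typing import Dict, Iterable


class InMemoryRepository:
    """Hypothetical fine-grained repository operations."""

    def __init__(self) -> None:
        self.objects: Dict[int, dict] = {}
        self._next_id = 1

    def create_object(self) -> int:
        object_id = self._next_id
        self._next_id += 1
        self.objects[object_id] = {
            "type": None, "labels": [], "content": b"", "published": False,
        }
        return object_id

    def set_type(self, object_id: int, object_type: str) -> None:
        self.objects[object_id]["type"] = object_type

    def add_label(self, object_id: int, label: str) -> None:
        self.objects[object_id]["labels"].append(label)

    def set_content(self, object_id: int, content: bytes) -> None:
        self.objects[object_id]["content"] = content

    def publish(self, object_id: int) -> None:
        self.objects[object_id]["published"] = True


def upload_content(repo: InMemoryRepository, content: bytes,
                   object_type: str = "document",
                   labels: Iterable[str] = ()) -> int:
    """Coarse-grained operation: one call performs every step."""
    object_id = repo.create_object()
    repo.set_type(object_id, object_type)
    for label in labels:
        repo.add_label(object_id, label)
    repo.set_content(object_id, content)
    repo.publish(object_id)
    return object_id
```

A programmer may still want the individual operations, but a drag-and-drop tool only needs to expose `upload_content` as a single step.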
Figure 1. The client library searches HTML content for embedded graphics, packages all binaries in a single SOAP message, and dispatches that message to the service endpoint. (Note: service endpoint depicted with conventions introduced by David Chappell.)
There is, of course, still a need for traditional programmers, and providing them with the tools to efficiently construct and deliver powerful ECM systems should also be a primary concern of the services provider. To address platform interoperability, a common approach in the SOA is to depend entirely on the services themselves for the reusable components of an application. Using an IDE to generate proxies from a WSDL definition does bind the application to the platform at the right time; however, the client-side classes that are generated as a result are reduced to facilitating message transfers. The unfortunate consequence is that we’ve essentially ignored the need for implementation reuse in client-side processing: we aren’t providing services that run on the client, yet sometimes this is exactly what we need.
From ECM, let’s consider the example of managing HTML content. Even in the simplest sense a typical Web page is really a collection of documents: the main HTML page plus the files containing the images on that page. When application developers wish to give their users the ability to upload a Web page into a repository, they have to write code that 1) scans the HTML file in search of image tags, 2) collects the HTML file and those image files, and 3) packages them for transfer, all before they can use the webPageImport service you provide. Rather than requiring each application developer to produce their own implementation of this functionality, which is arguably part of content management, I suggest you provide them with a reusable implementation in the form of a client-side library (see Figure 1). That being said, I caution you not to overuse this approach, as you will have to produce libraries for each of the client-side platforms independently, sacrificing interoperability. In most domains, ECM included, the true need for such libraries is minimal. The key is to balance the usefulness of such libraries against the platform interoperability needs.
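The three steps listed above might look roughly like the following in a client-side library. This is a minimal Python sketch using the standard `html.parser` module; the names `ImageCollector`, `find_embedded_images`, and `build_import_package` are hypothetical, the caller supplies the file-reading function, and the final step of wrapping the bundle in a SOAP message for the webPageImport service is deliberately left out.

```python
from html.parser import HTMLParser
from typing import Callable, Dict, List


class ImageCollector(HTMLParser):
    """Step 1: record the src attribute of every <img> tag."""

    def __init__(self) -> None:
        super().__init__()
        self.sources: List[str] = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names for us.
        if tag != "img":
            return
        for name, value in attrs:
            if name == "src" and value and value not in self.sources:
                self.sources.append(value)


def find_embedded_images(html: str) -> List[str]:
    """Return the distinct image references in document order."""
    collector = ImageCollector()
    collector.feed(html)
    collector.close()
    return collector.sources


def build_import_package(page_name: str, html: str,
                         read_file: Callable[[str], bytes]) -> Dict[str, bytes]:
    """Steps 2 and 3: gather the page and its images into one bundle.

    read_file is supplied by the caller so the sketch does not assume
    where the image files live.
    """
    package = {page_name: html.encode("utf-8")}
    for source in find_embedded_images(html):
        package[source] = read_file(source)
    return package
```

An application would call `build_import_package` once and hand the resulting bundle to whatever generated proxy sends the message, so the scanning logic is written a single time per client platform rather than once per application.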
ECM and SOA: Made for Each Other
The complex world of ECM is a textbook example of the important benefits that are conferred by a service-oriented architecture, and the benefits can be huge. Through the creation of ECM services, large, complex systems will be more effectively deployed, configured, and maintained. Further, ECM functionalities will be more easily integrated with other core business functions such as ERP and CRM systems.
Application of service orientation concepts to the ECM architecture must be carefully thought out and deliberately applied so as to meet the specific needs of the content-centric enterprise. Service orientation is not simply about Web services, interoperability, and distributed computing. Services design goals should include both enabling a more fluid, adaptable IT ecosystem and functioning within such an ever-changing environment.
Building services around the vast array of standards and standards-based tools available today will produce truly interoperable and widely reusable system components. Finally, understanding that your target consumers include both the traditional applications programmer and a new breed of application developer, the business analyst, will broaden your impact. For those who are tasked with providing ECM capabilities, regardless of the industry, service orientation and compliance with Web services-based standards will provide the tools for future success.