SOA: A Real Life Example
Documents are everywhere. Every process has some form of documentation to go along with it and that documentation needs to be managed. Managing these documents is a chore using traditional options, which include filing cabinets, manila envelopes, and even binder clips.
People have long looked to electronic document management as a way to decrease costs and reduce errors, and they expect instant access to the electronic versions of their documents. Although this seems to be a rather simple idea, in practice it is less than ideal and often creates new, distinct problems.
One of the biggest problems with electronic document management systems is getting non-electronic documents into the system. For example, if you have documents from a client who doesn't use the same electronic document management system as you, you still need to get that client's documents into your system efficiently and without pushing requirements back onto the source of the documents.
Although there are other needs within a document management system, such as indexing the documents efficiently or replicating a workflow for a document, for this example I will concentrate on the inbound aspect of document management: how to move documents into your system and how an SOA can help you do this.
While working for a document management startup, I came across a problem. The system in use was built around a simple desktop scanner that sent single-document uploads from users' desktops to the company's servers. Although this worked quite well initially, the company was constantly being asked whether it could handle larger document volumes. Clearly, we needed some way to scale up the document intake, as a single-sheet scanner was not sufficient.
If thinking that an SOA requires Web services is the biggest misconception, believing that SOAs are a new concept is the second biggest misconception.
The company provided a service that it intended to sell to many different clients, and every client had different requirements for submitting documents. These requirements varied from the source of documents (fax, e-mail, and high-speed scanners) to the frequency of input (daily, periodically, and on demand).
The first solution attempted was a set of integration applications, each created to map one document entry point into the main system. Unfortunately, this resulted in an N times M solution, where N is the number of document sources and M is the number of input frequencies that clients required. Clearly, this was undesirable and could not continue.
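The growth in that first approach can be made concrete with a small sketch. The sources and frequencies are the ones listed above; the function name is hypothetical, and the real system may have had more of either.

```python
from itertools import product

# Sources and frequencies named in the article; real clients may have had more.
SOURCES = ["fax", "e-mail", "high-speed scanner"]
FREQUENCIES = ["daily", "periodically", "on demand"]


def point_to_point_integrations(sources, frequencies):
    """Return one bespoke integration per (source, frequency) pair."""
    return list(product(sources, frequencies))


integrations = point_to_point_integrations(SOURCES, FREQUENCIES)
print(len(integrations))  # 9
```

With only three sources and three frequencies there are already nine separate integrations to build and maintain, and each new source or frequency adds a whole row or column.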
To solve this, I stepped back and looked at the problem and how it related to the existing system. The problem was not how to integrate with each client, but rather the more abstract problem of how to input documents into the existing system efficiently.
Upon rethinking the problem, the solution became apparent. Inserting documents into the system was a service that should be provided. Once a document input service was defined, the only remaining problem was developing consumers of that service. These consumers could be built by my company or by the client.
The breakthrough came when I thought of the Web site as a separate, isolated application from the larger document management system. The question became how to get documents into the system no matter who supplied them. For example, the system needed the ability to remotely upload documents using nothing but a Web browser.
That's when I realized that we could create a service that processes a package of documents. The package could come from any source; as long as it conformed to what the service was expecting, we could process it. This simple re-architecture was the first step to creating an SOA. Now, instead of writing a direct pipeline into the system for every client, and another for every document delivery type, we just had to be able to create a package of documents and place it where the service could process it.
Creating a package of documents allowed unlimited flexibility in the aspect of processing that was the most volatile: the client's desired document input requirements. The package also allowed us to stabilize the least volatile aspect of the system, namely the insertion of documents into the system.
By separating the delivery and representation of the documents from the processing of those documents into the system, we were able to standardize the expected groups of inbound documents.
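The separation described above can be sketched in a few lines. This is a minimal illustration, not the system's actual design: every class, field, and function name here is hypothetical, and the in-memory dictionary stands in for whatever storage the real service used.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Document:
    filename: str
    content: bytes


@dataclass
class DocumentPackage:
    """A group of documents plus who sent it (hypothetical field names)."""
    client_id: str
    source: str  # e.g. "fax" or "e-mail"; informational only
    documents: List[Document] = field(default_factory=list)


class DocumentInputService:
    """The single stable entry point: it sees packages, never delivery channels."""

    def __init__(self) -> None:
        self.stored: Dict[str, bytes] = {}

    def process(self, package: DocumentPackage) -> int:
        if not package.documents:
            raise ValueError("package contains no documents")
        for doc in package.documents:
            self.stored[f"{package.client_id}/{doc.filename}"] = doc.content
        return len(package.documents)


# Two consumers: each knows its own channel but only has to emit a package.
def package_from_fax(client_id: str, pages: List[bytes]) -> DocumentPackage:
    docs = [Document(f"fax-page-{i}.pdf", page) for i, page in enumerate(pages, start=1)]
    return DocumentPackage(client_id, "fax", docs)


def package_from_email(client_id: str, attachments: Dict[str, bytes]) -> DocumentPackage:
    docs = [Document(name, data) for name, data in attachments.items()]
    return DocumentPackage(client_id, "e-mail", docs)


service = DocumentInputService()
fax_count = service.process(package_from_fax("acme", [b"%PDF-1", b"%PDF-2"]))
mail_count = service.process(package_from_email("acme", {"invoice.pdf": b"%PDF-3"}))
print(fax_count, mail_count)  # 2 1
```

Adding a new delivery channel in this sketch means writing one more `package_from_...` function; `DocumentInputService.process` does not change.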
An SOA Example Implemented
To implement the architecture, we defined a flexible suite of services, many of which were published publicly. These services included:
- A Web service that accepted a packaged group of documents, including a metadata manifest describing the package contents
- A Windows Service that processed each package of documents, regardless of the origin of the package
- .NET Remoting objects that would also accept packages of documents, for faster processing. This service was used only for clients using .NET architecture
- A Web service that could be used to track the status of documents submitted.
XML was used to construct the manifest contained within the packages of documents. Initially, the system only supported documents in PDF format, but in successive releases, Microsoft Office document support was introduced.
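The article does not give the manifest schema, so the following is only a sketch of what building and reading such a manifest might look like. The element and attribute names (`package`, `document`, `name`, `format`, `size`, `sha256`) are assumptions made for illustration.

```python
import hashlib
import xml.etree.ElementTree as ET


def build_manifest(client_id, documents):
    """Build manifest XML from a dict mapping filename -> bytes.

    The schema is hypothetical; the original manifest format is not documented here.
    """
    root = ET.Element("package", {"client": client_id})
    for name, data in sorted(documents.items()):
        ET.SubElement(root, "document", {
            "name": name,
            "format": name.rsplit(".", 1)[-1].lower(),
            "size": str(len(data)),
            "sha256": hashlib.sha256(data).hexdigest(),
        })
    return ET.tostring(root, encoding="unicode")


def read_manifest(xml_text):
    """Return the document names listed in a manifest, in manifest order."""
    root = ET.fromstring(xml_text)
    return [doc.attrib["name"] for doc in root.findall("document")]


manifest_xml = build_manifest("acme", {"b.pdf": b"two", "a.pdf": b"1"})
names = read_manifest(manifest_xml)
print(names)  # ['a.pdf', 'b.pdf']
```

Recording a size and checksum per document is one way the processing service could verify that a package arrived intact before inserting anything; whether the original system did this is not stated.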
This meant that no matter where the documents originated, they could be packaged up, described in the relevant manifest, and submitted as a package to the servers. Some of the clients created to accomplish this included:
- A document upload Web page, allowing users to submit documents using nothing but the Web site
- A bulk uploading client that monitored directories and file shares on the client's computer, packaged the documents it found, and submitted them to the servers.
- An e-mail server integration client, allowing users to e-mail documents directly to the system.
- A fax server integration client, allowing users to fax documents directly into the system.
In the latter two examples, the client components allowed our clients to have documents submitted directly into the system without manual intervention.
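As an illustration of the bulk uploading client, here is a minimal sketch of one polling pass over a watched directory. It assumes packages are ZIP archives written to a drop directory that the processing service reads; the original packaging format and submission mechanism are not described, so those choices and all function names are hypothetical.

```python
import tempfile
import zipfile
from pathlib import Path


def collect_new_documents(watch_dir, seen):
    """Return PDF files in watch_dir that have not been packaged yet."""
    new_files = [p for p in sorted(watch_dir.glob("*.pdf")) if p.name not in seen]
    seen.update(p.name for p in new_files)
    return new_files


def write_package(files, drop_dir, package_name):
    """Zip the files into drop_dir, where a (hypothetical) service would pick them up."""
    package_path = drop_dir / f"{package_name}.zip"
    with zipfile.ZipFile(package_path, "w") as archive:
        for path in files:
            archive.write(path, arcname=path.name)
    return package_path


def poll_once(watch_dir, drop_dir, seen, package_name):
    """Package any new documents; return the package path, or None if nothing is new."""
    files = collect_new_documents(watch_dir, seen)
    if not files:
        return None
    return write_package(files, drop_dir, package_name)


# Demonstration in a throwaway directory.
with tempfile.TemporaryDirectory() as tmp:
    watch, drop = Path(tmp, "watch"), Path(tmp, "drop")
    watch.mkdir()
    drop.mkdir()
    (watch / "scan-001.pdf").write_bytes(b"%PDF-1.4 demo")
    (watch / "notes.txt").write_text("not a document we package")
    seen = set()
    first = poll_once(watch, drop, seen, "batch-1")
    with zipfile.ZipFile(first) as archive:
        packaged_names = archive.namelist()
    second = poll_once(watch, drop, seen, "batch-2")

print(packaged_names, second)  # ['scan-001.pdf'] None
```

A real client would call `poll_once` on a timer or from a file-system watcher and would also add the manifest to each package; both are left out to keep the sketch short.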
It's important to remember that the insertion of documents into the system was but one concern. There were other vertical slices that benefited from a provider and consumer model, including:
- Outbound document transmission
- Document workflows
- Document indexing and management
- Address book servicing