Loop into Developer Productivity
To significantly boost developer productivity, tools must automate the collection and dissemination of runtime information. The trouble is, most application development tools today focus on the forward direction of the development process, i.e., helping developers move from requirements to design, or from design to construction. With control loops, however, it is the backward flow of information that affects the business's bottom line. Data from the runtime use of Web service applications must be collected, filtered, analyzed, and sent back into the business analysis and development processes.
To reverse the flow in this way, instrumentation must be added to runtime services and their associated components. For example, in the long-distance rate-plan Web service, the Web service component needs to execute a set of transactions against corporate database systems to compute current rates. These transactions are a critical element of the processing for the Web service. Although they may look like a black-box component from a source-code perspective, they must be viewed as critical infrastructure from the testing perspective. If a transaction were to block for a considerable amount of time, perhaps because the database was locked, the entire Web service could become inoperable and useless to a customer.
To solve this problem, tools must be able to automatically discover transactions and their components. Not only must transactions be monitored, their internal behavior must also be traced so that a transaction map can be created. This map describes the transaction's progress through network elements and servers. Furthermore, timing information should be collected on each element, to reveal exactly where a fault or block is occurring. Finally, any unusual events must be recognized and called out. Unusual transaction events might include inability to access a database server or access-permission problems.
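As a rough illustration of the transaction map described above, the following Python sketch records each element a transaction passes through, the time spent there, and any unusual event. The class names, element names, and timings are all hypothetical and chosen only for illustration; a real tool would discover these automatically rather than have them recorded by hand.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class Hop:
    element: str                 # network element or server the transaction passed through
    elapsed_ms: float            # time spent in this element
    event: Optional[str] = None  # unusual event observed here, if any


@dataclass
class TransactionMap:
    name: str
    hops: List[Hop] = field(default_factory=list)

    def record(self, element: str, elapsed_ms: float, event: Optional[str] = None) -> None:
        """Append one traced step of the transaction."""
        self.hops.append(Hop(element, elapsed_ms, event))

    def total_ms(self) -> float:
        """Total time across all recorded elements."""
        return sum(h.elapsed_ms for h in self.hops)

    def slowest(self) -> Hop:
        """The element where the transaction spent the most time."""
        return max(self.hops, key=lambda h: h.elapsed_ms)

    def unusual_events(self) -> List[Tuple[str, str]]:
        """(element, event) pairs for every hop that reported an unusual event."""
        return [(h.element, h.event) for h in self.hops if h.event]


# Made-up trace of a rate computation that blocks on a locked database.
tx = TransactionMap("compute_current_rates")
tx.record("web-server", 12.0)
tx.record("gateway", 8.5)
tx.record("rates-db", 4200.0, event="lock wait")

print(tx.slowest().element)   # element where the block occurs
print(tx.unusual_events())    # events to call out to the developer
```

With per-element timing in one place, the blocked database stands out at once instead of being hidden inside a black-box component.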
When developers have this ability to quickly find a problem in a complex distributed service, they can then focus on how to replicate problems. Certainly, errors occur when customers use a service differently than developers expected or when unexpected circumstances arise. Our experience at HP suggests that developers generally simulate error-prone uses of their application instead of waiting for errors to occur. Part of the reason for this approach, obviously, is to design test plans. For example, developers might want to test whether a system can handle hundreds of thousands of transactions at once. Simulating such error-causing circumstances is much harder than allowing them to occur naturally. It also forces developers to write additional code for the simulations, which has a dramatic impact on productivity.
What is needed is a tool that can assist with Web service testing, both from the service perspective and from the customer perspective. From a service perspective, the tool must automatically generate transactions that mimic specific circumstances, such as high volumes. From a customer perspective, client-oriented tools must be able to capture and reproduce a usage flow through a Web site, from logon to exit, just as if a customer had browsed the site.
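The two perspectives can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions, not a real testing tool: `rate_plan_service` is a hypothetical stand-in for the actual Web service call, and `RECORDED_FLOW` is an invented list of captured steps. The first half generates concurrent transactions to mimic high volume; the second half replays a captured usage flow step by step.

```python
import time
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, List


def rate_plan_service(customer_id: int) -> float:
    """Hypothetical stand-in for the real Web service call."""
    time.sleep(0.001)  # pretend to do some work
    return 0.05 + (customer_id % 3) * 0.01


def timed_call(customer_id: int) -> float:
    """Invoke the service once and return the elapsed time in seconds."""
    start = time.perf_counter()
    rate_plan_service(customer_id)
    return time.perf_counter() - start


def generate_load(requests: int, concurrency: int) -> List[float]:
    """Service perspective: issue many transactions concurrently."""
    with ThreadPoolExecutor(max_workers=concurrency) as pool:
        return list(pool.map(timed_call, range(requests)))


# Hypothetical captured usage flow, from logon to exit.
RECORDED_FLOW = ["logon", "view_plans", "compare_rates", "logout"]


def replay(flow: List[str], perform_step: Callable[[str], None]) -> Dict[str, float]:
    """Customer perspective: re-run each captured step in order, timing each one."""
    timings = {}
    for step in flow:
        start = time.perf_counter()
        perform_step(step)
        timings[step] = time.perf_counter() - start
    return timings


latencies = generate_load(requests=200, concurrency=20)
flow_timings = replay(RECORDED_FLOW, lambda step: time.sleep(0.001))

print(f"{len(latencies)} transactions, slowest {max(latencies) * 1000:.1f} ms")
print({step: round(t * 1000, 1) for step, t in flow_timings.items()})
```

Both helpers return raw timings so that the same measurements can feed whatever monitoring and recording is in place during a test run.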
With both of these tools in place, developers can design a complete range of tests to simulate situations where the Web service infrastructure might fail or become intolerably slow. Along with simulation capabilities comes the need to monitor and record infrastructure activity during both testing and production use, in order to identify the source of problems. For example, it would be useful to identify what types of customers are having trouble, such as those in certain geographies or those using a particular connection type or ISP. Another useful measurement compares Web site performance with Web server performance. A discrepancy between the two can indicate that the server components are not properly distributed across the network infrastructure.
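To make the idea of identifying which types of customers are having trouble concrete, here is a small Python sketch that groups recorded requests by an attribute such as region or connection type. The field names and sample records are invented purely for illustration and do not come from any real measurement.

```python
from collections import defaultdict
from statistics import mean

# Invented monitoring records; a real system would collect these at runtime.
samples = [
    {"region": "EMEA", "connection": "dial-up",   "response_ms": 5200, "ok": False},
    {"region": "EMEA", "connection": "broadband", "response_ms": 400,  "ok": True},
    {"region": "APAC", "connection": "broadband", "response_ms": 350,  "ok": True},
    {"region": "APAC", "connection": "broadband", "response_ms": 450,  "ok": True},
    {"region": "AMER", "connection": "dial-up",   "response_ms": 4800, "ok": False},
]


def summarize(records, key):
    """Group records by `key` and report request count, error rate, and mean response time."""
    groups = defaultdict(list)
    for record in records:
        groups[record[key]].append(record)
    return {
        value: {
            "requests": len(group),
            "error_rate": sum(not r["ok"] for r in group) / len(group),
            "mean_ms": mean(r["response_ms"] for r in group),
        }
        for value, group in groups.items()
    }


by_region = summarize(samples, "region")
by_connection = summarize(samples, "connection")

for name, stats in by_connection.items():
    print(name, stats)
```

In this made-up data the failures follow the connection type rather than the region, which is the kind of pattern that a single aggregate figure would hide.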
Web service creation is another area where bridging the gap through operational monitoring affects developer productivity. A mechanism to match service-level agreements with deployed services would reduce errors and customer complaints. For example, the long-distance rate-plan service requires many network-infrastructure elements to execute, such as network storage for a database, gateways to internal and external networks for multi-vendor rate calculations, and internal Web servers for adequate performance.
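One way such a matching mechanism might look is sketched below in Python. It assumes, purely for illustration, that a service-level agreement can be reduced to a set of required infrastructure elements and a maximum response time, and that a deployment reports what it has provisioned and measured. The class names, element names, and figures are all hypothetical.

```python
from dataclasses import dataclass
from typing import FrozenSet, List


@dataclass(frozen=True)
class ServiceLevelAgreement:
    service: str
    required_elements: FrozenSet[str]  # infrastructure the service needs
    max_response_ms: float             # agreed upper bound on response time


@dataclass(frozen=True)
class Deployment:
    service: str
    provisioned_elements: FrozenSet[str]  # infrastructure actually in place
    measured_response_ms: float           # response time observed by monitoring


def sla_violations(sla: ServiceLevelAgreement, deployment: Deployment) -> List[str]:
    """Return a human-readable list of mismatches between an SLA and a deployment."""
    problems = []
    for element in sorted(sla.required_elements - deployment.provisioned_elements):
        problems.append(f"missing infrastructure element: {element}")
    if deployment.measured_response_ms > sla.max_response_ms:
        problems.append(
            f"response time {deployment.measured_response_ms:.0f} ms "
            f"exceeds agreed {sla.max_response_ms:.0f} ms"
        )
    return problems


# Hypothetical SLA and deployment for the long-distance rate-plan service.
sla = ServiceLevelAgreement(
    service="rate-plan",
    required_elements=frozenset({"network-storage", "network-gateway", "internal-web-server"}),
    max_response_ms=500,
)
deployed = Deployment(
    service="rate-plan",
    provisioned_elements=frozenset({"network-storage", "internal-web-server"}),
    measured_response_ms=950,
)

violations = sla_violations(sla, deployed)
for problem in violations:
    print(problem)
```

Running a check like this when a service is created or redeployed would surface a missing gateway before customers encounter it.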