Spreadsheet Programming: MS Excel as Component Development Environment

preadsheet programming is a natural evolution of component-oriented development. Its basic philosophy is to use MS Excel as a development environment for creating business logic components. These components can then be orchestrated to create vastly complex and sophisticated business processing systems that are still easy to manage and maintain. Furthermore, because the business rules are developed in Excel, business users?who typically drive system requirements?can easily review, verify, and even modify them.

Keep in mind that Excel is used only as a development environment. At runtime, spreadsheet engines can execute the business components without Excel, allowing for a high degree of scalability. Several vendors offer products that enable this technique on Java and .NET.

Spreadsheet programming is a natural evolution of component-oriented development.
This article explains how to begin leveraging spreadsheet programming for higher productivity. It covers the design, building, testing, and maintenance of applications with spreadsheet programming. It also provides the full source code for a sales tax calculator component built in Excel and deployed as either a Java applet or an ASP.NET page.

MS Excel as Business Component Development Environment
An Excel workbook consists of one or more worksheets, each of which contains many rows and columns of cells. Each cell can contain data (a number or a string) or a formula. Cells can be grouped into ranges containing one or more cells, and ranges can be named and referred to by name.

Under the paradigm of spreadsheet programming, a workbook corresponds to a business component, a cell containing data is an input to the business component, and cells with formulas make up the output behavior of the business component (see Table 1).

Traditional Component-oriented Development Spreadsheet Programming
Component Workbook
Public input data member Data Cell that is populated at runtime through the API
Public output method Formula Cell that is queried at runtime through the API
Private method Formula Cell that is not queried at runtime
Private constant data member Data Cell that is not changed at runtime
Table 1. Component-oriented Development Analogies in Spreadsheet Programming

Cell formulas in Excel can be composed of Excel’s built-in Functions, which include basic math operations as well as about 300 other functions. Some of the most useful functions for building business components are: IF, OR, AND, LOOKUP, OFFSET, COUNTIF, SUMIF. Experiment with these functions if you’re not familiar with them.

At design time, you work in Excel to interactively develop and test the business component’s functionality. But Excel cannot scale to perform calculations on a server for multiple clients. This is where spreadsheet engines come in. At runtime, your application communicates with the spreadsheet engine via an API, input parameters are passed in via a 'SetCellValue(. . .)' method, and the results of cell formulas are retrieved with a 'GetCellValue(. . .)' method.

Several powerful design patterns can be leveraged when using Excel to develop business components. This article discusses them further, but first jumps right into developing a simple business component with spreadsheet programming.A Simple Business Component
This article’s sample app relates to sales tax calculations (click here to see the spreadsheet programming sample app with source code). Currently, mail-order and Web-based retailers are not required to collect state sales tax for orders shipped to states where they don’t have a “substantial physical presence” (called Nexus). However, imminent legislation apparently will compel such retailers to correctly asses and collect sales tax for all states to which they ship, whether they have nexus or not.

Lets first examine a very basic tax calculator and then move on to a more sophisticated one. Figure 1 shows how a basic tax calculator can be developed in Excel.

Figure 1: Spreadsheet with Sales Tax Calculations

The following is the pseudo code from the application. It calls the tax component to pass the transaction data (input data) into the spreadsheet:

// Pass the Subtotal, State, and Nexus data into cells B3, B4, B5TaxSheet.SetValue(3, 2, 250.29); // SubtotalTaxSheet.SetValue(4, 2, AR); // StateTaxSheet.SetValue(5, 2, true); // Nexus indicator

and retrieve the result (output data):

// Get the tax from cell B7double tax = taxSheet.getValue(7, 2); // Calculated Sales Tax Amt.

In this simple example, a single cell formula in cell B7 calculates the tax rate:

"=IF(B5, LOOKUP(B4, E3:E56, D3:D56)*B3,0)"

This is a highly simplified example. It basically implements a lookup table in Excel rather than in code. The input cells (B3, B4, and B5) and the output cell (B7) act like the public API of the tax component. The other cells act like private members of the component. As is the case with all component-oriented development, as long as the pubic API doesnt change, the private members in Excel can be continually enhanced and updated without having to change the code of the caller. You can further isolate changes by passing values in and out of the spreadsheet by name instead of by Row,Col. So the pseudo code would look something like the following:

TaxSheet.SetValueByName("SalesSubtotal", 250.29); // Subtotal

Now you can move your cells around without having to change code. Using named cells also makes your cell formulas a lot more readable.

The collection of input cells and output cells make up the public API of the spreadsheet business component. The formula cells in the hidden layer of intermediate calculations are analogous to private methods of the business component.
Now let’s take a look at a more realistic example. In some states, different types of purchases are taxed at different rates. For example, food, liquor, clothing, and automobiles may all be taxed at different rates depending on the ship-to state. Figure 2 shows a tax component spreadsheet that accommodates multiple items and different tax rates based on the type of the item. Figure 3 shows the state tax data for multiple tax types.

Figure 2: More Sophisticated Tax Calculator Component
Figure 3: State Tax Data for Multiple Item Types

Using these spreadsheets allows you to determine the line item taxes and totals all at once, rather than item by item. Now assume that you have to make this component available as a Web service that enables clients to submit a complete customer order and receive the results in one pass. If you used Knowledge Dynamics’ KDCalc, you could employ its ProcessXML function, which allows multiple input cell values to be passed in and multiple results to be returned as XML. Listing 1 shows the pseudo code for this function. The return value, sResult, is an XML doc that contains all the requested output cell values.

Now a multi-line order can be processed all at once. More importantly, if the tax rates or the tax rules change, users can update the spreadsheet without changing any code. For example, some states now charge a luxury tax for autos that cost more than $50,000. To accommodate this, the tax expert would alter the cell formulas and the state tax table, but the Web service wrapper code would not have to change. Building and maintaining the tax logic in Excel takes a fraction of the effort that coding it by hand does. The relative efficiency improves even further for more complex business logic.

Now that you’ve seen a simple application in action, take another look at some of the important considerations for making the best use of spreadsheet programming.Design a Spreadsheet Business Component, Then Build an Application Around It
The most critical design task in spreadsheet programming is identifying the appropriate input cells and output cells. This defines the API for your spreadsheet. Once the API has been defined, the technical design of the calling application and the underlying spreadsheet logic can proceed independently. Using spreadsheet cell arrangements and formulas, you’ll find many design patterns for achieving different kinds of logic.

When building a spreadsheet-oriented application, you focus your effort in two areas: the spreadsheet and the caller application. The spreadsheet will contain the business logic, and business people can develop it. The caller application just needs to pass the right data in and out of the spreadsheet.

Testing and Maintenance
Testing is an area where spreadsheet programming really shines. Testing is required to verify two areas: (1) the business logic in the spreadsheet and, (2) the communication between the calling application and the spreadsheet. To test the business logic in the spreadsheet, first look at the input cells and the result cells, or the public API. You can create a separate worksheet that contains hundreds or thousands of rows of test data sets and expected results. With a single, generic VBA function, you can cycle through the test data, retrieve the results, and compare them with the expected results. This automated testing delivers tremendous efficiency.

The next step is to test the communication between the calling application and the spreadsheet to ensure that the correct data is moved in and out of the spreadsheet at the correct cell locations. You can do this in a variety of ways, depending on which spreadsheet engine product you use. KDCalc, for example, includes a Spreadsheet Bridge component that inserts a tabbed spreadsheet grid between your calling application and the KDCalc engine. So as your app is passing data in and out of the engine, you can peek in to see what all the values are. This method of visual debugging is much faster than stepping through the API code one line at a time because you can see all the values at once.

Maintenance of spreadsheet-oriented applications is another area with powerful efficiencies. As long as the API between the calling application and the underlying spreadsheet stays constant, the caller and the spreadsheet can be updated independently. Maintaining business logic in a spreadsheet is more efficient than maintaining code. Taking control of a spreadsheet that someone else created is also far easier that taking over someone else’s code.

Kinds of Applications Suitable for Spreadsheet Programming
Sales tax calculations are an obvious fit for spreadsheet programming, but they aren’t the only ones. Some other obvious fits are payroll tax deduction modules for HRM systems, calculating shipping costs based on weight, delivery method, and delivery schedule, and account transactions in financial services.

Literally anything that involves business decisioning or calculations can be built faster and more accurately with spreadsheet programming.
Spreadsheet programming is also a perfect fit for a lot of other applications that may not be so obvious. Some of these are insurance and loan underwriting, dynamic pricing calculations based on order composition and inventory levels, and pricing and trading of custom financial instruments. Literally anything that involves business decisioning or calculations can be built faster and more accurately with spreadsheet programming.Spreadsheet Programming Products
Several commercial spreadsheet engines support spreadsheet programming (see Figure 4).

Formula1, Actuate Corp.
Formula1 is a full-featured Java spreadsheet component. It reads and writes Excel files, supports Excel formulas, and has grid views, graphs, and reporting. Because Formula1 is all Java, caller applications use it just like any other third-party Java class.

JIntegra, Intrinsyc Corp.
JIntegra is a bridge that communicates to Excel through DCOM, a proprietary Microsoft remote communication protocol. JIntegra presents a Java API to the calling application, sparing developers from having to write Excel integration code themselves.

KDCalc, Knowledge Dynamics, Inc.
KDCalc compiles Excel spreadsheets into a compressed executable file. It supports Excel formulas but has no visual interface, so it’s primarily used as a calculation engine or in cases where the developer builds a custom UI in lieu of a spreadsheet metaphor. KDCalc is available for Java and .NET; KDCalc for Java is pure Java and KDCalc .NET is 100 percent managed code. Calling it is like calling any other class.

Figure 4: Comparison of Products That Support Spreadsheet Programming

All of these products provide an API that allows Java or .NET callers to access the spreadsheet features. All these products use the basic development process shown in Figure 5 to facilitate spreadsheet programming.

Figure 5: The Spreadsheet Programming Development Process

A Powerful Technique with Almost No Learning Curve
Spreadsheet programming is a powerful technique that has virtually no learning curve, instantly and dramatically improves software development productivity, and makes complex logic transparent and accessible to the business users who drive requirements. It should be a part of every developer’s toolkit.

Share the Post:
Share on facebook
Share on twitter
Share on linkedin

Related Posts