Starting with Office 2000, Microsoft began migrating their document structure types away from proprietary binary formats and towards XML. With OfficeXP the process was completed for Excel and greatly matured for the other applications.
This change creates an opportunity for software developers to change the paradigm of how they deliver content. To demonstrate this potential change, this article uses an example of a workflow situation at a financial services software house.
How It Is Done
Typically, when you’re putting together a server-based application for the delivery of financial or business information, you work with a business analyst to determine what data is delivered by the application and what data is produced by the application. The production of new data is generally called the application of business logic.
In my experience, the tool of choice for business or financial analysts is Microsoft Excel. Excel allows analysts to easily mock up how they would like the data to appear and has a powerful formulae editor that allows them to demonstrate their desired calculations. The mock-up is typically passed on to you. You would then use a server side technology such as ASP or JSP to retrieve the data from a database, apply any business logic and present it as an HTML page. This page, would of course be only an approximation of the original mock-up due to the constraints of HTML.
Many developers note the irony that a follow-up requirement for their applications is that they provide some form of export to Excel so that end-users can further manipulate the data (with Office XP and Internet Explorer 6, a user may now right click on any page and select ‘Export to Microsoft Excel’ to allow for this).
It’s obvious that a lot of effort is wasted in putting the business logic formulae and desired presentation into an Excel mock-up that is eventually thrown away. It’s also obvious that this reworking of the original intents is an opportunity to introduce errors into the application.
In this article, I propose a change to this workflow: take a basic sample of an Excel workbook, containing data and formulae, and use it as a template for dynamic workbooks where the data may be retrieved from a database. No new logic will need to be added to perform the business analytics. Instead, the logic that is already present in the template workbook is exploited.
The workbook in Figure 1 is composed with Excel and saved out as an XML file called default.xml. For the sake of this article, you’ll retrieve the information for the fields within the balance sheet from a flat XML file called input.xml.
|Figure 1: An Excel Workbook containing desired presentation and business analytics.|
It is very straightforward to change the application to read the data from your specific back-end data source either by URL or via a data-access technology.
Save the default.xml and input.xml files from the download to your “C:” directory. If you want to use a different directory, take note of it, and change the references in the source code as necessary.
Using the New Project wizard within Visual Studio.Net, select ‘Visual J# Projects’ and choose ‘ASP.NET Web Service’. Set the location to be ‘http://localhost/OfficeDoc’ and choose ‘OK’.
Replace the wizard-generated code for OfficeDoc.asmx.jsl with the code from the download.
What Does the Web Service Do?
The Web service code in the download produces a very simple output, as you can see in the mock up shown in Figure 1.
Using an input parameter (which is the ticker of the company for whom you are producing the balance sheet), the code builds the location of the balance sheet data within the XML file containing the balance sheet information as an XPath.
The XPath below will return the revenue information for IBM as a nodelist.
The code then parses this nodelist, and puts the information into the XML document describing the spreadsheet at the proper cell locations.
Opening the default.xml file from the download using Internet Explorer, you will see a document much like that in Figure 2.
|Figure 2: The XML Breakdown of an Excel workbook.|
This XML file is broken down into several sections. The ‘Workbook’ node contains the following nodes:
Expand the ‘Worksheet’ node to see the following nodes:
Expand the ‘Table’ node to see the following nodes:
If you are familiar with Excel, you’ll be in familiar territory. Columns and rows are the heart of the data storage structures within Excel.
Expanding the first ‘Row’ node presents a series of cells, containing the headings for the columns of data.
As expected, expanding the second Row shows the ‘Revenue’ line from the mockup. The first cell contains the label, the second contains the formula for calculating the average % change, the third is empty, and so on.
Rebuilding the Excel Workbook from Our Data Source
Take a look at the Web service code to find out how to manipulate the XML with the raw data. In Listing 1, you see how to loop through the three rows of data and then loop through columns 2, 4, 6, and 8 to place the data into the appropriate cells. Previously, the values for Revenue, Costs, and Taxes had been loaded into string arrays containing the appropriate values. Their values are placed into the cells by using the set_InnerText() method on an XML node in J#.
This code works on a node in the XML document, pulled out of the main document using an XPath to determine its location.
The XPath is simply the address of the nodes in the workbook containing the worksheet data as indicated below:
The Web service has now created an XML document. Returning to this document directly, you find that the browser sees only an XML document rendered with the default stylesheet (see Figure 2). To render this document as a spreadsheet, simply manipulate the MIME type of the returned content. This effectively fools the browser into thinking it’s being sent a workbook, when in fact you are sending it an XML document.
Here’s the code:
HttpContext.get_Current().get_Response().Clear();HttpContext.get_Current().get_Response()._ set_ContentType("application/vnd.ms-excel");HttpContext.get_Current().get_Response()._ Write(xmlOutputDoc.get_InnerXml());
Establish the response stream using the HttpContext.get_Current() call within j# and manipulate it with the three lines of code above. The first line clears the response stream, the second sets the MIME type to that of an Excel document, and the third outputs the contents of the newly manipulated XML document. The Excel document is now rendered in the browser space.
In some cases, the user will get a warning about opening the file. To override this, open explorer and open the tools->folder options dialog. From this dialog, under the ‘File Types’ tab, select XLS as the file extension, click ‘Advanced,’ and uncheck the ‘Confirm Open after Download’ check box.
Using Other Office Documents
Presently, Excel is the only Office application to have a pure XML implementation of its base document type. However, Word and Powerpoint have HTML representations containing inline XML of the document contents. It’s possible for you to manipulate these documents by loading the HTML of your template document into a string buffer and then use search and replace to put your data into the correct locations.
The MIME types for these applications are:
XML’s Usefulness Is Growing
As the industry adoption of XML grows, companies like Microsoft are conceiving innovative new ways to use it. There is a clear opportunity here to take advantage of the flexibility XML provides. Besides solving data interchange problems, XML can help streamline the development process, making us more efficient as developers. This article presents just one of the countless possibilities.