his article walks you through a case study that illustrates how to marry SharePoint 2007 with Excel data to offer a Business Intelligence (BI) “lite” solution for avid spreadsheet users. The information and lessons learned in this article come from a real-world project, where a company needed to collect data from hundreds of Excel documents, as well as SQL databases and Word documents, and chose to use SharePoint 2007 to present and “slice and dice” this information.
The project’s success showed clearly that BI data doesn’t have to come from databases or Excel services; it can come from (arguably) the most popular management and reporting tool in use: Excel.
You’ll see some of the challenges involved in creating this project, and explore some options for importing Excel data into SharePoint 2007. The article also provides some sample VB code that can help you massage Excel data into a more SharePoint-friendly form.
The Business Challenge
Discussions with the client determined that their business-critical data (financial information used to manage investments in over 100 companies) was stored in two main areas: a SQL 2000 database that stored data for a financial front-end application, and hundreds of spreadsheets that came directly from the companies in which the client had invested (the “managed companies”).
We considered the option of using SharePoint 2007’s Business Data Catalog functionality and creating Application Definition files to connect to the SQL 2000 database. The main hurdle was that the SQL data was a little “suspect” and could not be counted on to be 100 percent accurate. However, as the front-end application could easily create Excel reports, qualified professionals could review the data manually, which served as a verification process. With that verification step in place, the data could be trusted, so the first design decision became: Use SQL 2000 data exported to verified Excel spreadsheets as the trusted source of information from the financial application.
After reviewing the Excel spreadsheets that were being used to collect financial data from the managed companies, we determined that one standard template could be modified to hold all the critical data that would be needed to populate the SharePoint 2007 sites. Therefore, the second design decision became: Use Excel spreadsheets as the trusted source of financial information about the managed companies.
So the decision was reached quite quickly that?by hook or by crook?the trusted data needed to come from two types of Excel 2003 spreadsheets per managed company. We would create a SharePoint 2007 site for each managed company (Company A, Company B, Company C, and so on) and then roll up the data to a central site for additional reporting purposes.
A List of Import Options
The next decision involved how to move data from the spreadsheets to SharePoint 2007 lists. In addition to Excel Services (see the sidebar “Excel Services“), which was ruled out immediately, we considered these possible options:
- Cut and paste data from the spreadsheets to SharePoint 2007 lists.
- Use SharePoint 2007’s ability to create a list based on the contents of the spreadsheet to import data from the spreadsheets to SharePoint 2007.
- Convert the data in the spreadsheet to a database table and then export to SharePoint 2007.
- Import the Excel worksheets to Access 2007 and then export the Access database to a SharePoint 2007 list.
- Find a third-party product that could import the data to SharePoint 2007.
As the list of options grew and we reviewed the spreadsheets’ content in more detail, a couple of things became clear. I’ll use some examples and screen captures to illustrate some of the limitations encountered.
Exploring Import Options 1-3
We ruled out Option 1 (the cut and paste option) immediately, because there were thousands of cells to import for each company. That led to the next design decision: Cutting and pasting to SharePoint 2007 lists is too labor intensive and prone to error.
Exploring Option 2, SharePoint 2007 can create a list based on the contents of a spreadsheet (by accessing the Create command and then choosing Import Spreadsheet from the Custom Lists section), but that has several limitations. First, it’s a one-time import so there’s no easy way to add additional rows at a later time, or update existing content when it changes in the spreadsheets (which are updated monthly). Second, SharePoint 2007 wants the source data to be organized in rows, but the typical financial spreadsheet is organized in columns. Figure 1 shows a typical spreadsheet, while Figure 2 shows the results of trying to import that spreadsheet into SharePoint 2007. That failure led to another design decision: The Import Spreadsheet function in SharePoint 2007 doesn’t work with standard accounting and financial spreadsheet layouts, and is a one-time process, so it doesn’t support future updates to the spreadsheets.
Fortunately, there’s a workaround to the import problem. If you simply copy the data in one spreadsheet onto one worksheet, and then use the Paste Special option to paste it to another worksheet, you can use the Transpose option, which switches column data to row data, as shown in Figure 3. You’ll still need to apply a few changes to make the data SharePoint friendly. You can see one such change in Figure 3, which shows the headings in row A moved to row B. If you don’t make this change, SharePoint assumes that the only headings are in row A and everything else is data. Figure 4 shows the new SharePoint list created by importing the massaged data. The resulting design decision was: Transposed and reformatted data from standard financially oriented spreadsheets (with dates as column headings) can be imported effectively into a SharePoint 2007 list.
In Figure 5, each row now corresponds to a date, and each column heading corresponds to a financial value, such as Actual Revenue or Budget/Plan Revenue. However, because the original spreadsheet used standard financial formatting, the list headings aren’t fully representative of the data they contain; instead, the headers have numbers added to give them unique values (Actual2, Budget/Plan3, Forecast4, Actual5, etc.). Additionally, what were the originally the section headers in the nicely formatted original spreadsheet have been imported as separate columns, containing no data.
|Figure 6. Imported Column Types: After importing a spreadsheet, the List Settings show the column types based on SharePoint’s interpretation of the data.
A quick look at the List Settings for the list (see Figure 6) shows that SharePoint 2007 made some incorrect assumptions about the imported data. For example, the import assigns both the dates and the Revenue Forecast information in the spreadsheet a column type of “Single line of text.” So after some experimentation, it became clear that the data would require quite a bit of formatting before it could be presented properly in SharePoint 2007 using the Import Spreadsheet method. For example, the blank cells in the spreadsheet should contain “0” to force SharePoint to treat those cells as numeric rather than text data, and the column headings should be made unique and more descriptive prior to import. The resulting design decision was: Significant formatting changes are required after the transposition of data to ensure that SharePoint creates a list with properly labeled and formatted columns.
The next issue with this process (using the Import Spreadsheet feature in SharePoint 2007) comes when new data is added to the source spreadsheet. There is no connection from the initial source spreadsheet to the new SharePoint list. A new list could be created every time the source data changes, but then the SharePoint administrator is faced with administrative headaches: are the old lists deleted or kept for reference, and how will users find the latest set of data? If the spreadsheet only changes once a year, this could be a viable option, but the client’s spreadsheets tended to change once a month, meaning that new data would periodically need to be manually cut and pasted from the new spreadsheet to the existing SharePoint list. Once again the issues of labor and possible human error come into play, which simplifies this next design decision: The Import Spreadsheet method is not a good option when the source spreadsheet changes frequently and you need to import the new or modified data to the same SharePoint list.
|Figure 7. Transposed and Formatted Data in Table Format: Results of selecting the Format as Table button in Excel 2007 (after installing an add-in from Microsoft).
Option 3, converting the data in the spreadsheet to a table and then exporting to SharePoint 2007, addresses one of the weaknesses in Option 2. By converting the range to a table and then exporting to SharePoint 2007, you can establish a connection between the spreadsheet and the SharePoint list. However, as you’ll see below, you need a special Excel 2007 download from Microsoft to make this work. In addition, you must save the file in Excel 2003 format to have synchronization functionality.
Microsoft publishes complete instructions on the process of converting a range to a table, but the process essentially involves selecting the range, and then clicking on the Format as Table button on the Home tab of the Ribbon. Figure 7 shows the sample set of data transposed, the headings from row 1 moved to row 2, and then converted to a table.
To now publish this data to SharePoint 2007 and enable it to synchronize when changes are made, you need to download an Excel 2007 add-in from Microsoft, which you’ll find in the article “Excel 2007 Add-in: Synchronizing Tables with SharePoint Lists.”
|Author’s Note: This article provides some insight into why Microsoft “deprecated” this functionality: Microsoft recommends that users instead use Access 2007 to write data to SharePoint lists and for using lists offline.
After you install and configure the download add-in in Excel Options, you’ll find a new button (“Publish and allow Sync”) on the Design tab on the Office Ribbon that lets you publish and sync with SharePoint 2007. Now you can publish a read-write list to SharePoint 2007, but only if the workbook is saved in Excel 2003 compatibility mode. If you save the document in Excel 2007 format, the publish is a one-time, non-synchronizable event. Helpful messages in Excel will alert you to the loss of functionality if you try to save in the “wrong” format.
|Figure 8. Imported List: Here’s how the newly formatted table looks when imported to SharePoint.
Now you can access the Design tab in Excel 2007 from your newly formatted table, and then click the Publish and allow Sync button. You’ll need to enter the SharePoint 2007 site name and the list name to begin the publishing process. Figure 8 shows the results of exporting this table to SharePoint 2007.
For usability’s sake, modify the resulting columns in SharePoint to have appropriate and complete names. For example, you could rename column 1 to Date and the Actual column to Revenue Actual.
|Figure 9. List with Renamed and Deleted Columns: After renaming and deleting unneeded columns, here’s the resulting list in SharePoint.
Finally, you can delete the empty Revenue column. Figure 9 shows the results of these changes in the SharePoint 2007 list.
To synchronize these changes back to Excel (saving in Excel 2003 format, don’t forget!), return to the spreadsheet, right click on a cell in the table, and select “Table, Synchronize with SharePoint.” The table in Excel will adjust correspondingly. You can also make changes in the Excel spreadsheet. One nice feature is that if unsynchronized changes exist in either the list or the spreadsheet, the next user who opens the spreadsheet will be given the option to “load the pending changes.” That became the next design decision: By installing the “Synchronizing Tables with SharePoint Lists” Excel add-in, and then saving the spreadsheets in Excel 2003 compatibility mode, you can establish a two-way synchronization with a SharePoint 2007 list.
Exploring Import Options 4-5
Option 4, importing the Excel worksheets to Access 2007 and then exporting the Access database to a SharePoint 2007 list turned out not to be a viable option, because it involved creating and managing yet another database. After that, you still need to reformat the Excel data to get it into a format suitable for SharePoint.
While Access does facilitate connectivity to SharePoint and provide a wealth of reporting features, that convenience didn’t merit the additional design and testing efforts. The design decision was: Importing Excel worksheets to Access 2007 and then exporting the Access database to a SharePoint 2007 list adds an unnecessary level of complication and management.
Regarding Option 5?finding a third-party product to import the data?a company named Bamboo Solutions offers a product called List Bulk Import that can migrate data and documents into SharePoint 2007 from legacy databases, spreadsheets, and from older versions of SharePoint. In other words, this product eliminates the need to use methods such as those already described.
The List Bulk Import product proved to operate as advertised, but like SharePoint 2007, it expects data to be arranged into rows rather than columns. In a positive light, however, the List Bulk Import product has a sophisticated interface for mapping values on the spreadsheet to the SharePoint list, and provides options that can check for duplicate values in the SharePoint list, update existing data, and create scheduled import jobs. You can save Import Templates in xml files for reuse or editing. The product also produces a log file that provides an audit trail so you can see whether imports succeed or fail. See the Bamboo Solutions web site for additional information. The final design decision was: Bamboo Solutions’ List Bulk Import offers some powerful features that make it a better solution than Options 1-4.
The Bamboo List Bulk Import does not create the SharePoint 2007 list; you need to create the list in advance. This actually proved to be advantageous. Creating the list manually helped ensure that column headings were fully descriptive and unique, and provided an opportunity to determine the format of the data, numeric precision, whether numbers were displayed as percents, and other formatting options. You can use content types in SharePoint 2007, storing them in the gallery for the site collection, which means new columns can be added or existing ones changed, and the changes pushed out to lists using the content type.
|Author’s Disclaimer: I worked with Bamboo Solutions to scope out the addition of a feature that allows users to define a range of cells that is transposed during the import process. They added this feature to their product (for a modest fee of course). So look for the cell transposition feature in upcoming versions of the List Bulk Import product.
Using Excel Macros to Prepare Your Data for Import into SharePoint
As discussed above, standard financial spreadsheets typically organize data in a way that is “unfriendly” to SharePoint lists (see Figures 1 and 2 for an example). Unless the organization is willing to change its financial spreadsheets (unlikely), you’ll need to manipulate the data to prepare it for importing. Fortunately, Excel includes a powerful macro engine that you can use to make such changes easily.
The following list shows some tasks that will be required, regardless of which import option you decide to use, and some examples of macros that show how you can automate such tasks:
- Transpose data from the standard row format to place data in columns. The upcoming version of the List Bulk Import tool will offer this functionality. If you’re not using that, you’ll need a macro to transpose data.
- After the transposition, you’ll probably want to move some column titles to place all headings in the same rows (see Listing 1).
- Financial spreadsheets often contain text in cells that should contain numbers. The classic example is the inclusion of “NA” in cells. SharePoint will not accept text when it expects the cell to contain a number so you’ll need to change text to numbers (see Listing 2)
As you can see, there are some tricks involved in pulling data from Excel spreadsheets into SharePoint 2007 lists. This article described several different methods, all of which were examined in the context of real-world requirements. Although we eventually decided to use a third party product, companies whose data is less complex, changes rarely, or who don’t require audit logs may opt for one of the simpler solutions described here.
|Author’s Note: Thanks to Ralph Hefner who provided the code listings for this article.