Designing Smart Documents in Office 2003

Duplication of effort in gathering information is an increasingly familiar scenario in many companies today. For example, an employee complains, “I’ve already submitted my expense report to finance, and now you want me to re-enter it in this intranet portal app?” In this scenario the employee must enter expense information for a manager’s approval and then file a separate expense report with finance for settlement, because many finance departments use financial software packages that aren’t integrated with the company’s intranet portal. Such isolated, duplicative systems give rise to “scattered data islands,” many of which are never repurposed.

You may dismiss this case as a simple integration problem, but consider how many times you’ve come across similar situations. For example, you may have submitted a well-documented list of components created in your previous projects to the company’s intranet knowledge base portal. Your current project could reuse one of those components, but querying the document management system/portal may not find your previous documentation. Even if it does, you may need to perform a manual search to find the required information within the document. These conditions arise because the products used for development lack an integrated solution framework. Microsoft SharePoint portals addressed some of these issues, but the biggest issue is that companies have lacked a means to create intelligent repurposable documents.

Office 2003 can help solve such problems. Office 2003 supports an XML representation of content, so you can treat an entire Word document as a well-formed XML document. XML alleviates the problem of dealing with proprietary formats by letting you author templates based on an XML schema (XSD). The XML content can then be filled in through automation, from databases or Web services, or through direct data entry by users. In every case, the goal is that you can subsequently use the predefined schema to access the content of those documents, searching, altering, or retrieving any defined content within them using standard XML processing techniques.
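
As a rough illustration of what “standard XML processing techniques” can mean here, the following Python sketch parses a small XML fragment and retrieves two fields from it. The element names documentRoot, section, customer, customername, and address follow the ones used later in this article; the fragment itself, the sample values, and the helper name find_customer_fields are assumptions made for illustration. A real Word 2003 XML file would also carry namespaces and Word’s own markup, which this sketch ignores.

```python
import xml.etree.ElementTree as ET

# Hypothetical saved-data fragment; real output from Word would include
# namespaces and additional markup that are omitted here.
SAMPLE_XML = """
<documentRoot>
  <section>
    <customer>
      <customername>Contoso Ltd.</customername>
      <address>1 Main Street</address>
    </customer>
  </section>
</documentRoot>
"""


def find_customer_fields(xml_text):
    """Return the customer name and address found in the XML text."""
    root = ET.fromstring(xml_text)
    customer = root.find("./section/customer")
    if customer is None:
        # No customer element was applied to the document.
        return None
    return {
        "customername": customer.findtext("customername", default=""),
        "address": customer.findtext("address", default=""),
    }


fields = find_customer_fields(SAMPLE_XML)
print(fields["customername"])
print(fields["address"])
```

Because the lookup depends only on the schema’s element names, the same few lines would work against any document built from the same template, which is the repurposing benefit described above.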

Microsoft’s “Smart Document” concept provides considerable flexibility to achieve this goal. It makes documents context-sensitive based on the schema definitions, lets you create rich, interactive client applications, and produces output in a standard XML form that you can repurpose as needed.

The idea of context-sensitive documents isn’t new, but until now, it’s been common only in well-defined narrow applications. Office extends the potential for context sensitivity to any document based on a schema.

Here’s an example. Imagine that you’ve created an element in the document. When editing, a user points to that node and immediately the Office API identifies your tag definition and responds to it appropriately, perhaps providing a pick list or checking to make sure the entered information is valid. The process of defining the element, hooking it up to the Office API and providing a custom response when a user selects the tag is all programmable. This programmability leads to highly interactive and “intelligent” documents. For example, you can easily restrict which parts of a document a given user can change. Creating smart documents based on schema opens up a completely new arena of programming.
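
The idea of responding to the selected element can be sketched without any Office code at all. The Python example below is not the Office API; it is a language-neutral illustration in which the element names, the pick-list values, the validation rule, and the function name respond_to_element are all hypothetical. It shows only the dispatch step: given the name of the element the user is in, decide whether to offer a pick list or validate the entry.

```python
# Hypothetical element names and rules, chosen only for illustration.
PICK_LISTS = {
    "region": ["North", "South", "East", "West"],
}

VALIDATORS = {
    # A customer name is considered valid when it is not blank.
    "customername": lambda text: bool(text.strip()),
}


def respond_to_element(element_name, entered_text=""):
    """Return what the document should do for the element the user selected."""
    if element_name in PICK_LISTS:
        return ("pick_list", PICK_LISTS[element_name])
    if element_name in VALIDATORS:
        is_valid = VALIDATORS[element_name](entered_text)
        return ("validation", is_valid)
    # Elements without a registered rule get no special behavior.
    return ("none", None)


print(respond_to_element("region"))
print(respond_to_element("customername", "Contoso Ltd."))
```

In a real smart document the equivalent decision is made inside your ISmartDocument implementation, which Office calls as the user moves between elements.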

You can build and deploy such solutions in two ways: via the classic COM route or by using managed .NET code. If you choose the managed code option you can use the Visual Studio.NET Tools for Office 2003, which provide core template-based project solutions. Alternatively, you can build managed DLLs that use COM Interop to communicate with the Office applications. Either way, the bad news is that Office 2003 doesn’t support any managed SDK. That’s a big disadvantage; however, in most cases the advantages of using managed code outweigh the disadvantages of having to distribute the .NET Framework and the performance penalties of COM Interop.

Getting Started
For this article, the goal is to create a simple customer information template that contains basic customer information and a particular item that the customer has purchased. You’ll see more about the exact entities required by this document later, but you can look at the downloadable Customer Information.dot template file (see Figure 1) to get a good idea of the overall content. To complete the application as described here, you’ll need VB 6.0, Visual Studio.NET 2003, and Office 2003 installed on your development machine. The primary interop assemblies (PIAs) for Office 2003 ship with the product, and the full installation installs them, but you need to select the PIAs explicitly if you perform a custom Office installation. Finally, I suggest you install the Office SmartDocument SDK, because it provides some useful tools and important schema files.

Note: If you have both a pre-beta installation of VS.NET 2005 (Whidbey) and VS.NET 2003, you may have problems running Office applications in debug mode. I was able to eliminate these problems by uninstalling the pre-beta installation.

Here are the basic steps involved in creating a SmartDocument application:

  • Create a document template that users will fill out to submit information.
  • Create an XSD schema that logically relates to the content (entities) in the document template that you intend to capture.
  • Apply the XSD schema to the document.
  • Create shims that implement the ISmartDocument interface and hook up the shim implementation to call managed code. Shims in this case are unmanaged DLLs whose role is restricted to loading assemblies built in managed code.
  • Implement the ISmartDocument interface in your managed code. The managed code encapsulates all the business logic for the application.
  • Create an XML manifest for downloading the application-dependent files. You could also create MSI packages, which could contain XML manifest files for deployment.

Much of this work has been done for you in the downloadable sample code. Extract the download contents into a folder, and follow these steps:

  1. The folder path is critical for this example because the manifest path points to the various DLLs and Schema based on the file path. Make sure you change the appropriate path in the CustomerManifest.xml file so it matches the path where you extracted the sample code.
  2. Find the Data.xml file (which the sample uses instead of a database), and note its location. You’ll need to alter the hard-coded path for that document in the GetData() method of the DocumentUtils class.
  3. Find the DocumentShim.dll and register it using regsvr32.
  4. Compile the .NET solution with the “Register for COM interop” option turned on. You’ll find this option among the build properties on the project’s property pages in VS.NET.
  5. If you installed the SmartDocument SDK, locate and run “Disable XML Expansion Pack Manifest Security” from the SDK’s program menu. You should find this reg file in the Tools menu group.
  6. Finally, open the Customer Information.dot template in Word, go to “Tools, Templates and Add-ins,” and click the XML Expansion Packs option. If you see any available or attached XML expansion packs, delete them and then add them back from the extracted folder.

After making these configuration alterations you’re ready to follow along with the rest of this article.

Author’s Note: The XML expansion pack requires a trusted certificate; to avoid this requirement, run the DisableManifestSecurity reg file. Also, make sure that your Word macro security setting is not set to High.

Creating the Document Template
Open Word and insert two tables as shown in the enclosed sample “Customer Information.dot” document (see Figure 1). The first table will hold customer details, while the second one holds a purchased item. Add the text information inside the table as shown, and save the document as a Word template (.dot) file. You won’t see the XML nodes shown in Figure 1 until after you apply the schema.

Figure 1. Template View: After applying a schema, the Word template contains two tables and displays the XML tags from the schema.

Creating the XML Schema
You need to define an XML schema containing the elements and attributes that represent the information required by the document template. The Customer.xsd schema (see Listing 1) fulfills that purpose for this sample application. It’s often useful to use attributes for the data you intend to repurpose and elements for the data you display in the document. For example, suppose you want to capture an element with the value “United States Of America” from a database. You fetch the identifier (the primary key value) for “United States Of America” from the database and assign it to an attribute node while displaying “United States Of America” as element text. This gives users the flexibility to change the text whenever appropriate; for example, a user might override the text to display “USA” but cannot change the attribute value that identifies the row in the database, which lets you repurpose the information easily later. Associating such primary keys with display text in intelligent documents is important, because you want to give users maximum flexibility to change the content as needed without affecting the validity of the business process.
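
The split between a key attribute and display text can be sketched in a few lines of Python. The country element and its id attribute are hypothetical names chosen for illustration and are not taken from Customer.xsd, and the key value 42 is invented. The sketch shows only that the two pieces of data live in separate places; it does not show how Word prevents users from editing the attribute.

```python
import xml.etree.ElementTree as ET

# Hypothetical element: the "id" attribute carries the database key and the
# element text carries what the user sees.
country = ET.fromstring('<country id="42">United States Of America</country>')

# The user overrides only the display text.
country.text = "USA"

# The key used to repurpose the data later is unaffected.
print(country.get("id"))
print(ET.tostring(country, encoding="unicode"))
```

Any later process that needs the database row reads the attribute and can ignore whatever wording the user chose for the visible text.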

Applying Schema
The schema definition in Listing 1 should give you an idea of how to design custom markup in XML format. You need to apply the schema defined in the XSD to the document. You can add schema and UI structure dynamically (via programming) using bookmark identifiers or you can pre-define them while creating the template. For now, take the pre-defined approach, which you can accomplish using only Word.

  1. Open the document template, click the dropdown menu at the top of the Task pane and select “XML Structure.”
  2. Click the Tools menu and select “Templates and Add-Ins.” A tabbed dialog pops up with “XML Schema” as the default tab.
  3. Click “Add schema” and select the Customer.xsd schema. By clicking the XML Options button, you can see the various XML Options settings available for the document. For now, leave the default settings and click OK.
  4. In the XML Structure task pane you’ll see a documentRoot element. Click that element in the task pane. You get an information box asking you whether to apply the element for the entire document or just the selection. Click on “apply to entire document”. Word will add a pink DocumentElement tag surrounding the document content.
  5. In the XML Structure task pane’s “elements in the document” section you’ll see a yellow question mark indicator which informs you that something is missing in the document and the schema is not valid. The next element to select is displayed as section in the “Choose an element to apply” area at the bottom of the task pane.
  6. Click the documentElement tag, which selects the content (the two tables), and then click the section element in the task pane. Word applies that tag and shows it with a different indicator under the XML Structure task pane’s “documentRoot” element. Right-click the element and you’ll see a message stating that the content for that element is incomplete. Click the “Attributes” option and add the values shown in the enclosed document template.
  7. Select the first (Customer) table and then click the customer element in the task pane to apply it to the table. Continue by clicking in the appropriate table fields and applying the customername and address elements, selecting each child element from the task pane. Repeat these actions to define the elements for the Item table. Note that Word does validate the structure, but it doesn’t stop you from creating a structure that violates the schema, especially when you’re working manually against an unprotected document.
Author’s Note: You don’t have to apply the region element and attributes right now because you’ll see an example of adding node attributes and values dynamically later in this article.

Creating a Shim
You’ve done the configuration work you need to get started programming. Now you need to make the document intelligent. If you were wondering why you need VB 6.0 installed, it’s because you can create VB 6.0-based wrapper classes (not PIAs) that make calls to managed code. Shims are not required to program Office 2003 applications, but they make it easier to get started. If you specifically want to avoid using shims, refer to this article on MSDN for more information. Doing so requires configuration changes to your manifest and altered CAS settings for the managed code. Assuming you opt for unmanaged shims as described here (called “Action Handlers” in SmartDocument terminology), the approach is as follows:

  1. Create a new VB6 ActiveX DLL project and name it DocumentShim. Rename the default class to SmartDocShim.cls.
  2. Add a reference to the Microsoft Smart Tags 2.0 Type library and implement the ISmartDocument interface.
  3. Each method and property of this interface is significant, as discussed later. For now, create an instance of ISmartDocument type in the SmartDocInitialize method, for example:
   Private objSmartDocument As ISmartDocument

   Private Sub ISmartDocument_SmartDocInitialize(...)
      ' Create the managed Customer class through COM, by its ProgID
      Set objSmartDocument = CreateObject( _
         "Office.Samples.SmartDocument.Customer")
      ...
   End Sub

The ProgID passed to CreateObject in the preceding code refers to a managed code Customer class DLL that you will create. This VB6 DLL simply acts as a hookup to the managed code.

  4. Implement all the method and property calls and route each one to the managed code instance created within the SmartDocInitialize method. The class file SmartDocShim.cls in the DocumentShim.vbp project is just a direct hookup of the initialized document. You can find a thorough explanation of the reasoning behind this method of implementing the definitions in this MSDN article.
Author’s Note: The shim needs to interoperate with managed code arrays in a few methods, which is why you will see a ResetArrayList method in SmartDocShim.cls.
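
The forwarding pattern that the shim follows can be sketched independently of COM. The Python example below is only an analogy, not the sample’s VB6 or managed code: the class names ManagedCustomer and Shim and the snake_case method names are hypothetical stand-ins, and the return values are invented. It shows the two behaviors described above: the shim creates the real implementation during initialization and then passes every other call straight through.

```python
class ManagedCustomer:
    """Stand-in for the managed class that holds the business logic."""

    def smart_doc_initialize(self, document_name):
        self.document_name = document_name
        return "initialized " + document_name

    def control_count(self, element_name):
        # Invented rule: only the customer element exposes controls.
        return 2 if element_name == "customer" else 0


class Shim:
    """Thin wrapper that creates the target once and forwards every call."""

    def __init__(self, factory):
        self._factory = factory
        self._target = None

    def smart_doc_initialize(self, document_name):
        # Create the real implementation here, as the VB6 shim does inside
        # SmartDocInitialize, then forward the call to it.
        self._target = self._factory()
        return self._target.smart_doc_initialize(document_name)

    def control_count(self, element_name):
        # No logic of its own: pass the call straight through.
        return self._target.control_count(element_name)


shim = Shim(ManagedCustomer)
print(shim.smart_doc_initialize("Customer Information.dot"))
print(shim.control_count("customer"))
```

Keeping the wrapper free of logic means that changes to the business rules touch only the managed class, and the registered shim rarely needs to be rebuilt.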
