

Designing Smart Documents in Office 2003

Today, most organizations have a wealth of Office documents that contain critical information, but finding, extracting, and reusing that information programmatically remains a largely unrealized goal. Fortunately, that's changing as XML processing in Microsoft Office 2003 grows up.





Duplication of effort in gathering information is an increasingly familiar scenario in many companies today. For example, an employee complains, "I've already submitted my expense report to finance, and now you want me to re-enter it in this intranet portal app?" In this scenario the employee must enter expense information for a manager's approval and then file a separate expense report with finance for settlement, because many finance departments use financial software packages that aren't integrated with the company's intranet portal. Such isolated, duplicative systems give rise to "scattered data islands," many of which are never repurposed.

You may dismiss this case as a simple integration problem, but consider how many times you've come across similar situations. For example, you may have submitted a well-documented list of components created in your previous projects to the company's intranet knowledge base portal. Your current project could reuse one of those components, but querying the document management system or portal may not find your previous documentation. Even if it does, you may need to search manually to find the required information within the document. These conditions arise because the products used for development lack an integrated solution framework. Microsoft SharePoint portals addressed some of these issues, but the biggest remaining problem is that companies have lacked a means to create intelligent, repurposable documents.

Office 2003 can help solve such problems. Office 2003 supports an XML representation of content, so you can treat an entire Word document as a well-formed XML document. XML alleviates the problem of dealing with proprietary formats by letting you author templates based on an XML schema (XSD). The XML content can then be filled in through automation, from databases or Web services, or through data entry directly by users. Whichever route you choose, the goal is that you can subsequently use the predefined schema to access the content of those documents, searching, altering, or retrieving any defined content within them using standard XML processing techniques. Microsoft's "Smart Document" concept provides considerable flexibility to achieve this goal. It makes documents context-sensitive based on the schema definitions, lets you create rich, interactive client applications, and produces output in a standard XML form that you can repurpose as needed.
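To make the idea of retrieving schema-defined content with standard XML processing concrete, here is a minimal sketch in Python. It assumes a simplified document saved in Word 2003's XML format, in which elements from a custom schema wrap ordinary Word paragraphs. The expense-report namespace, the element names, and the sample values are hypothetical and exist only for illustration; a real document would carry far more markup.

```python
import xml.etree.ElementTree as ET

# Namespace of the Word 2003 XML document format (WordprocessingML).
W_NS = "http://schemas.microsoft.com/office/word/2003/wordml"
# Hypothetical namespace for a custom expense-report schema (an assumption).
EXP_NS = "urn:example:expense-report"

# Simplified, hand-written sample; real Word output is much more verbose.
SAMPLE_DOC = f"""
<w:wordDocument xmlns:w="{W_NS}" xmlns:exp="{EXP_NS}">
  <w:body>
    <exp:expenseReport>
      <exp:employee><w:p><w:r><w:t>Pat Lee</w:t></w:r></w:p></exp:employee>
      <exp:amount><w:p><w:r><w:t>125.50</w:t></w:r></w:p></exp:amount>
    </exp:expenseReport>
  </w:body>
</w:wordDocument>
"""


def extract_fields(xml_text, schema_ns):
    """Return {element name: text} for leaf elements of the custom schema."""
    root = ET.fromstring(xml_text)
    prefix = "{" + schema_ns + "}"
    text_tag = "{" + W_NS + "}t"
    fields = {}
    for elem in root.iter():
        if not elem.tag.startswith(prefix):
            continue  # skip Word's own markup
        has_schema_child = any(
            d is not elem and d.tag.startswith(prefix) for d in elem.iter()
        )
        if has_schema_child:
            continue  # container element such as expenseReport
        # Join the text of every Word text run nested inside this element.
        text = "".join(t.text or "" for t in elem.iter(text_tag))
        fields[elem.tag[len(prefix):]] = text
    return fields


if __name__ == "__main__":
    print(extract_fields(SAMPLE_DOC, EXP_NS))
```

Because the lookup is keyed on the schema's namespace rather than on Word's layout markup, the same function would work for any document built on that schema, however the author formatted it.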

The idea of context-sensitive documents isn't new, but until now, it's been common only in well-defined narrow applications. Office extends the potential for context sensitivity to any document based on a schema. Here's an example. Imagine that you've created an element in the document. When editing, a user points to that node and immediately the Office API identifies your tag definition and responds to it appropriately, perhaps providing a pick list or checking to make sure the entered information is valid. The process of defining the element, hooking it up to the Office API and providing a custom response when a user selects the tag is all programmable. This programmability leads to highly interactive and "intelligent" documents. For example, you can easily restrict which parts of a document a given user can change. Creating smart documents based on schema opens up a completely new arena of programming.
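The context-sensitive behavior described above can be pictured as a lookup from schema element names to responses. The sketch below is a conceptual illustration only and does not use the Office Smart Document API; the element names, the `ElementBehavior` type, and both handler functions are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


def accept_any(value: str) -> bool:
    """Default validator: every value is acceptable."""
    return True


def is_positive_number(value: str) -> bool:
    """Validator for amounts: the text must parse as a number above zero."""
    try:
        return float(value) > 0
    except ValueError:
        return False


@dataclass
class ElementBehavior:
    """What the host should do when the user works inside an element."""
    choices: List[str] = field(default_factory=list)  # pick list to offer
    validate: Callable[[str], bool] = accept_any      # check on entry


# Hypothetical schema elements mapped to their custom responses.
BEHAVIORS: Dict[str, ElementBehavior] = {
    "department": ElementBehavior(choices=["Finance", "Engineering", "Sales"]),
    "amount": ElementBehavior(validate=is_positive_number),
}


def on_element_selected(name: str) -> List[str]:
    """Return the pick list for the selected element, or an empty list."""
    return BEHAVIORS.get(name, ElementBehavior()).choices


def on_value_entered(name: str, value: str) -> bool:
    """Return True if the entered value is valid for the element."""
    return BEHAVIORS.get(name, ElementBehavior()).validate(value)


if __name__ == "__main__":
    print(on_element_selected("department"))
    print(on_value_entered("amount", "abc"))
```

In a real smart document the host application, not your own code, detects which element the user is in and calls your handlers; the table-driven shape of the response is the part this sketch is meant to convey.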

You can build and deploy such solutions in two ways: via the classic COM route or by using managed .NET code. If you choose the managed code option you can use the Visual Studio.NET Tools for Office 2003, which provide core template-based project solutions. Alternatively, you can build managed DLLs that use COM Interop to communicate with the Office applications. Either way, the bad news is that Office 2003 doesn't support any managed SDK. That's a big disadvantage; however, in most cases the advantages of using managed code outweigh the disadvantages of having to distribute the .NET Framework and the performance penalties of COM Interop.
