Cool Thing #1: The New Files are Open and XML
The new file formats are a logical progression of the XML work done in the most recent versions of Office. The last three versions have incrementally increased the XML capabilities of the Office applications to the point that, today, it is possible to generate Office documents through their respective XML specifications (i.e., WordprocessingML and SpreadsheetML) without manipulating the Word and Excel object models. Given the ubiquity of XML and the XML features already included in Office 2003, the new default file formats should be viewed as a positive step forward because they place XML front and center in Office.
Architecture
It's a ZIP file! The File Package
Parts Relationships
This document defines nine relationships for Document.xml. For each relationship, the document defines both the content type and the location of the related file. These relationship definitions are very important because part names do not persist across saves. Thus, when Word opens a document package, it uses the relationships defined in the package to locate the document parts and build out the document in the Word UI.
Cool Thing #1 Key Takeaway: The Office Open XML file formats free developers from the tyranny of the binary file format. The new files are wide open and accessible from any application without manipulating the Office applications' object models. If you are tempted to say, "Big deal! I can do whatever I want with Office files already!", keep reading.
To Learn Even More: Download (and read!) the Microsoft Office Open XML Formats Guide here.
Cool Thing #2: The New Formats Make Anything Possible
Well, they may not bring about world peace, but the new formats sure make developing solutions that include Office files a lot less stressful. No longer does a developer have to go through the Office applications just to make wholesale changes to one or more documents. Instead of needing to know the intricacies of an Office application's object model, a developer only needs to know how to peruse the contents of a ZIP file (the document package) to find and edit the desired document parts (the various XML files included in the package). If you think about it for a few minutes, it doesn't take much brainpower to come up with some interesting new Office-related developer scenarios.
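To make the "peruse the contents of a ZIP file" idea concrete, here is a minimal Python sketch that lists the parts in a package and reads its top-level relationships. It is an illustration only: the function names, the in-memory sample package, and the `urn:example:...` identifiers are invented for the example, and a real package uses the namespace and relationship types defined by the format.

```python
import io
import zipfile
import xml.etree.ElementTree as ET


def list_parts(package):
    """Return the name of every file (part) stored in the package."""
    with zipfile.ZipFile(package) as zf:
        return zf.namelist()


def read_relationships(package, rels_name="_rels/.rels"):
    """Return (Id, Type, Target) tuples from one relationships part."""
    with zipfile.ZipFile(package) as zf:
        root = ET.fromstring(zf.read(rels_name))
    # Match on the local element name so the sketch does not depend on
    # which relationships namespace URI a given build of the format uses.
    return [
        (el.get("Id"), el.get("Type"), el.get("Target"))
        for el in root.iter()
        if el.tag.rsplit("}", 1)[-1] == "Relationship"
    ]


# Hypothetical two-part package built in memory so the sketch runs as-is.
sample = io.BytesIO()
with zipfile.ZipFile(sample, "w") as zf:
    zf.writestr(
        "_rels/.rels",
        '<Relationships xmlns="urn:example:relationships">'
        '<Relationship Id="rId1" Type="urn:example:main-document" '
        'Target="word/document.xml"/></Relationships>',
    )
    zf.writestr("word/document.xml", "<document/>")

print(list_parts(sample))
for rel_id, rel_type, target in read_relationships(sample):
    print(rel_id, rel_type, target)
```

Passing the path of a real document package instead of `sample` should work the same way, since `zipfile.ZipFile` accepts either a file name or a file-like object.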
Here are a few application ideas I thought of while researching this article:
Each of these ideas is made possible (or made simpler) by the new, open file formats.
Cool Thing #2 Key Takeaway: The new XML file formats create new possibilities when developing solutions for your clients by freeing Office files from their binary format, as well as from their host application's object model. In addition, Microsoft has submitted the new file formats to ECMA for standardization. Once accepted, the ECMA standardization will encourage widespread implementation and adoption of the file formats by non-Microsoft applications. The end goal here is to increase Office document interoperability across platforms and back-office applications.
To Learn Even More: Download (and read!) the Microsoft Office Open XML Formats Architecture Guide here. Take a look at the ECMA proposal here. The initial draft of the proposal can be downloaded here.
Cool Thing #3: You Can Add Your Own Content to the Document Package
Like all other Microsoft technologies, the document packages are architected to provide for extensibility. It is quite possible to include custom XML data inside the package and then reference that data from within the document. The custom data resides in a special folder named dataStore, which sits in the package's word folder (or excel for Excel, ppt for PowerPoint). This means it is still possible to attach XML schemas to documents and include dynamic XML data. It also means that whenever the custom XML data is updated, those changes will be reflected in the document.
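As a rough sketch of what "adding your own content" means at the ZIP level, the Python snippet below copies a package and drops one custom XML file into the dataStore folder described above. The function name, the file name `item1.xml`, and the sample data are placeholders invented for illustration. The sketch also assumes that a real package would need the new part declared and referenced through the package's own metadata, which is not shown here.

```python
import io
import zipfile


def add_custom_part(src, dst, xml_text, part_name="word/dataStore/item1.xml"):
    """Copy package src to dst and add one custom XML part.

    The dataStore folder follows the article's description; the file
    name item1.xml is a placeholder invented for this sketch.
    """
    with zipfile.ZipFile(src) as zin, \
            zipfile.ZipFile(dst, "w", zipfile.ZIP_DEFLATED) as zout:
        for name in zin.namelist():
            zout.writestr(name, zin.read(name))  # copy existing parts unchanged
        zout.writestr(part_name, xml_text)       # then add the custom data


# In-memory archives stand in for real files on disk.
source = io.BytesIO()
with zipfile.ZipFile(source, "w") as zf:
    zf.writestr("word/document.xml", "<document/>")

target = io.BytesIO()
add_custom_part(source, target, "<customer><name>Contoso</name></customer>")

with zipfile.ZipFile(target) as zf:
    print(zf.namelist())
```

Writing to a new archive rather than modifying the original in place keeps the source document untouched if anything goes wrong partway through.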
Cool Thing #3 Key Takeaway: The new file formats are extensible and can include custom data for use within the document itself.
To Learn Even More: Read Brian Jones's blog, especially this post.
Cool Thing #4: The System.IO.Packaging Namespace
If you haven't yet wondered how to manipulate the new file formats in code, you were bound to soon. The answer is that the System.IO.Packaging namespace contains all the classes you need to code against the new formats. The new Office file formats are based on the Open Packaging Conventions (as is the XML Paper Specification). The namespace will be released with Windows Vista as part of WinFX's Windows Presentation Foundation.
Although you could use any tool that can manipulate ZIP files, you don't need one: the System.IO.Packaging namespace is designed for this purpose, making manipulation of the document package as simple as opening a ZIP file, querying the relationships for the desired content types, and then adding, editing, or deleting files. The key objects (at least for the sample discussed next) are defined in Table 1.
Cool Thing #4 Key Takeaway: The new file formats even come with a specialized set of classes contained in the System.IO.Packaging namespace. These classes provide a developer with the ability to create, modify, and delete document packages.
To Learn Even More: Download the WinFX SDK here. Also read Kevin Boske's blog (not many posts right now, but I am sure more will follow). Kevin is the Office Programmability Program Manager, so his blog is worth subscribing to.
Cool Thing #5: The New File Formats Are Easy to Build Upon (Sample Application Overview)
Once you get the hang of it, the new file formats are really simple to develop against (with one minor issue; see the Dev Tip below). To demonstrate, I created a sample Web page that allows a user to pick a document and then select among several options for headers and footers (see Figure 2). Once the user makes their selections, they can press the "Build Document" button, causing the Web page to open the selected document and insert the desired header and footer.
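The sample itself relies on System.IO.Packaging. As a rough illustration of the same open, modify, and save cycle at the ZIP level, the Python sketch below copies a package while passing one part through an edit function. It is not the article's code: the function name, the `DRAFT`/`FINAL` text, and the single-part sample package are hypothetical, and inserting a real header or footer involves more than editing one file.

```python
import io
import zipfile


def rewrite_part(src, dst, part_name, transform):
    """Copy package src to dst, passing one part's bytes through transform."""
    with zipfile.ZipFile(src) as zin, \
            zipfile.ZipFile(dst, "w", zipfile.ZIP_DEFLATED) as zout:
        for name in zin.namelist():
            data = zin.read(name)
            if name == part_name:
                data = transform(data)  # edit only the requested part
            zout.writestr(name, data)   # every other part is copied as-is


# Hypothetical one-part package standing in for a user-selected document.
original = io.BytesIO()
with zipfile.ZipFile(original, "w") as zf:
    zf.writestr("word/document.xml", "<document>DRAFT</document>")

updated = io.BytesIO()
rewrite_part(
    original,
    updated,
    "word/document.xml",
    lambda data: data.replace(b"DRAFT", b"FINAL"),
)

with zipfile.ZipFile(updated) as zf:
    print(zf.read("word/document.xml").decode("utf-8"))
```

Taking the edit as a function keeps the copy loop generic, so the same helper could apply an XML-aware change instead of a byte replacement.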
Dev Tip: For some reason, System.IO.Packaging does not appear automatically in the Add References dialog box. This is no cause for concern, however, because the namespace lives inside the WindowsBase.dll assembly, which is easily added to your project. After you install the January version of the WinFX SDK, WindowsBase.dll can be found in the C:\WINDOWS\Microsoft.NET\Windows\v6.0.5070 folder.
Listing 1 contains the complete listing of the code behind the page. The application logic follows this flow:
When you run the code, the original document quickly updates with my choices for header and footer (see Figure 3).
Cool Thing #5 Key Takeaway: Manipulating the new file formats in code is a simple process that does not require a lot of effort.
To Learn Even More: Look for more content on the DevX Microsoft Office Professional Developer Portal, as well as MSDN.
Office 2007's new Open XML file format provides extensive capabilities to a developer. As this article has pointed out, there is a lot to be excited about and to learn.
Ty Anderson runs Cogent Company, a consultancy in Dallas specializing in leveraging technology to enable business strategy. Ty is a regular contributor to the Microsoft Developer Network (MSDN) and has recently written a book, Office 2003 Programming: Real World Applications, focused entirely on building applications with the Microsoft Office System 2003.