X Marks the .doc—An Overview of the Office Open XML File Format

The 2007 Microsoft® Office system introduces a new XML-based file format called Office Open XML as its default file type. An open, royalty-free standard, Open XML gives developers a wide range of new opportunities, chiefly the ability to create and manipulate Office files without using Office. We'll answer the basic questions, then dive into what Open XML looks like, how it works, and how you can code against it.


When installing the Microsoft Office system, the first thing you'll probably notice, after the ribbon and task panes, is the file names: the old, familiar three-letter extensions suddenly have an "x" after them. But developers, long used to changing file formats, will find something much greater than a new binary. With Open XML, you'll find the seed for an entirely new architecture, opening up not only the file format itself but also opportunities for a new breed of application.

With the Office Open XML file format standard, ratified by Ecma in December 2006, developers can build apps that read and write Office documents without instantiating Office itself. This hints at the potential for building both standalone products that integrate with Office and more deeply integrated enterprise applications based on the Office Business Applications (OBA) architecture.

What Is Open XML?
The "Ecma Office Open XML File Format," or just Open XML, is an open, international standard for describing productivity documents, such as spreadsheets, word processing documents, and presentations. Open XML files package XML documents into a zip container, which the Microsoft Office system saves with extensions like .docx, .xlsx, and .pptx for Office Word, Office Excel, and Office PowerPoint, respectively. This container includes the content itself, a description of the formatting, and metadata, all in XML format, as well as any binaries embedded in the document, such as graphics, video, or audio.
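
To make the "XML documents in a zip container" idea concrete, here is a minimal sketch in Python that assembles a tiny word processing package using only the standard library. The three part names follow the layout Word typically produces, and the paragraph text and the function name build_minimal_docx are made up for illustration. Treat it as a sketch of the container structure rather than a complete document writer; whether a given Office version accepts such a stripped-down file is not verified here.

```python
import io
import zipfile

# Content-type map for the package: tells a consumer how to interpret each part.
CONTENT_TYPES = (
    '<?xml version="1.0" encoding="UTF-8" standalone="yes"?>'
    '<Types xmlns="http://schemas.openxmlformats.org/package/2006/content-types">'
    '<Default Extension="rels" '
    'ContentType="application/vnd.openxmlformats-package.relationships+xml"/>'
    '<Default Extension="xml" ContentType="application/xml"/>'
    '<Override PartName="/word/document.xml" '
    'ContentType="application/vnd.openxmlformats-officedocument.'
    'wordprocessingml.document.main+xml"/>'
    '</Types>'
)

# Root relationships: points from the package to its main document part.
ROOT_RELS = (
    '<?xml version="1.0" encoding="UTF-8" standalone="yes"?>'
    '<Relationships '
    'xmlns="http://schemas.openxmlformats.org/package/2006/relationships">'
    '<Relationship Id="rId1" '
    'Type="http://schemas.openxmlformats.org/officeDocument/2006/'
    'relationships/officeDocument" Target="word/document.xml"/>'
    '</Relationships>'
)

# The document body: one paragraph, one run, one text node (placeholder text).
DOCUMENT = (
    '<?xml version="1.0" encoding="UTF-8" standalone="yes"?>'
    '<w:document '
    'xmlns:w="http://schemas.openxmlformats.org/wordprocessingml/2006/main">'
    '<w:body><w:p><w:r><w:t>Hello, Open XML</w:t></w:r></w:p></w:body>'
    '</w:document>'
)


def build_minimal_docx() -> bytes:
    """Zip the three XML parts together and return the package as bytes."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as package:
        package.writestr("[Content_Types].xml", CONTENT_TYPES)
        package.writestr("_rels/.rels", ROOT_RELS)
        package.writestr("word/document.xml", DOCUMENT)
    return buffer.getvalue()


data = build_minimal_docx()
print(zipfile.ZipFile(io.BytesIO(data)).namelist())
```

To try the result in a word processor, you could write the bytes to a file with a .docx extension. Nothing in the code touches Office, which is the point of the format.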

Microsoft has further extended the Open XML format to provide for custom functionality via macros and add-ins by introducing additional macro-enabled file formats, such as .docm, .xlsm, and .pptm. Keeping these formats separate offers an extra level of protection against malicious or inept code: users know for sure that a .docx file, say, doesn't contain any macros, and Office won't even open it if it does.
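
Because the package is just a zip file, your own code can make a similar check before trusting a document. The sketch below assumes that a VBA project is stored in a part named vbaProject.bin, which is how Office typically writes macro-enabled files; that name is a convention assumed here, not something this article specifies, and the helper names has_vba_project and make_package are hypothetical. The packages it inspects are stand-ins with placeholder contents so the example runs without Office.

```python
import io
import zipfile


def has_vba_project(package_source) -> bool:
    """Report whether any entry in the package is named vbaProject.bin.

    Assumption: macro code lives in a part with that file name.
    """
    with zipfile.ZipFile(package_source) as package:
        return any(
            name.lower().rsplit("/", 1)[-1] == "vbaproject.bin"
            for name in package.namelist()
        )


def make_package(entry_names) -> io.BytesIO:
    """Build a stand-in package; entry contents are placeholders, not real parts."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as package:
        for name in entry_names:
            package.writestr(name, b"placeholder")
    return buffer


plain = make_package(["[Content_Types].xml", "word/document.xml"])
macro_enabled = make_package(
    ["[Content_Types].xml", "word/document.xml", "word/vbaProject.bin"]
)

print(has_vba_project(plain))          # False
print(has_vba_project(macro_enabled))  # True
```

With a real file you would pass its path instead of an in-memory buffer. A check like this is only a first filter; it says nothing about what the macro code does.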

Though Open XML came into play with the 2007 Microsoft® Office system, this isn't the debut of XML in relation to Office. It actually started way back with Office 2000, when Microsoft introduced XML document properties. In Office XP, documents themselves began to taste the flavor of XML with the Office Excel spreadsheet XML format. That led to SpreadsheetML and WordprocessingML in Office Excel 2002 and Office Word 2003, respectively, setting the stage for full-on Open XML-based formats in the 2007 Microsoft Office system. Now the three flagship products of the Office suite (Office Word, Office Excel, and Office PowerPoint) all use the Open XML standard as their default file format.

To summarize, Open XML = Office Open XML Format = Ecma Office Open XML: an international open standard with 89 total schemas; a file format specification for electronic documents; and a zip file containing Office markup data in XML format, based on the Open Packaging Conventions (OPC), a new architecture that allows for things like embedded custom XML parts.
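
The packaging layer can be explored on its own. In an OPC package, a relationships part at _rels/.rels points from the package root to its main part, so a reader does not need to hard-code a path like word/document.xml. The following Python sketch follows that pointer. It assumes the relationship type URI used by the transitional form of the standard, the function name main_part_name is hypothetical, and the sample package is a stand-in built in memory.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

REL_NS = "http://schemas.openxmlformats.org/package/2006/relationships"
# Relationship type that marks the main document part (transitional URI).
OFFICE_DOCUMENT = (
    "http://schemas.openxmlformats.org/officeDocument/2006/"
    "relationships/officeDocument"
)


def main_part_name(package_source) -> str:
    """Read _rels/.rels and return the zip entry name of the main part."""
    with zipfile.ZipFile(package_source) as package:
        rels = ET.fromstring(package.read("_rels/.rels"))
    for rel in rels.iter(f"{{{REL_NS}}}Relationship"):
        if rel.get("Type") == OFFICE_DOCUMENT:
            # Targets may be written with a leading slash; zip names have none.
            return rel.get("Target").lstrip("/")
    raise ValueError("no officeDocument relationship found")


# Stand-in package with only the relationships part and a placeholder body.
sample = io.BytesIO()
with zipfile.ZipFile(sample, "w") as package:
    package.writestr(
        "_rels/.rels",
        f'<Relationships xmlns="{REL_NS}">'
        f'<Relationship Id="rId1" Type="{OFFICE_DOCUMENT}" '
        'Target="word/document.xml"/>'
        "</Relationships>",
    )
    package.writestr("word/document.xml", "<placeholder/>")

print(main_part_name(sample))
```

The same lookup should work for spreadsheets and presentations, where the target would be a different part, because the relationship mechanism belongs to the packaging layer rather than to any one document type.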

What It Looks Like
Assuming you have not yet upgraded and just want to take a look at a sample document, check out Figure 1. This was created using the Office 2007 Trial. As you can see, there's not much to this sample document. It uses Office Word 2007's default style set, with some basic formatting.

Figure 1. Sample Office Word 2007 Document

Here's how the Open XML standard works: describe the document using multiple XML files, then zip them all together. That's pretty much it. In a way, Microsoft Office applications have gone from being document creators to WYSIWYG XML editors for a specific kind of XML file (though they also recognize legacy Office and other file formats). To prove this, rename a document's extension to .zip in File Explorer and see for yourself.

The document in Figure 1 began life as "SampleOpenXMLDocument.docx". By changing the filename extension from .docx to .zip, you can open it like a compressed folder, revealing the contents shown in Figure 2 (here I've also opened the "word" folder within the zip file).

Figure 2. Contents of .docx as .zip File
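
You can get the same listing from code without renaming anything, since a zip reader doesn't care about the file extension. This is a minimal Python sketch; the function name list_package_parts is hypothetical, and the package it lists is a stand-in with placeholder entries so the example runs without a real document. A file saved by Word will contain more parts than these.

```python
import io
import zipfile


def list_package_parts(docx_source) -> list[str]:
    """Return the entry names inside an Open XML package (path or file object)."""
    with zipfile.ZipFile(docx_source) as package:
        return package.namelist()


# Stand-in package; with a real file you would pass its path instead,
# for example list_package_parts("SampleOpenXMLDocument.docx").
sample = io.BytesIO()
with zipfile.ZipFile(sample, "w") as package:
    for name in (
        "[Content_Types].xml",
        "_rels/.rels",
        "word/document.xml",
        "word/styles.xml",
    ):
        package.writestr(name, "<placeholder/>")

for name in list_package_parts(sample):
    print(name)
```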

To go even further, take a look at the contents of some of these files. For example, Figure 3 shows the document.xml file:

Figure 3. SampleOpenXMLDocument.zip\word\document.xml
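
Since document.xml is ordinary XML, any XML parser can read it. The Python sketch below pulls the visible text out of each paragraph by walking the w:p (paragraph) and w:t (text) elements. The function name paragraph_texts and the sample content are made up for illustration, and the approach is deliberately simple: it ignores formatting and would count paragraphs nested inside other paragraphs' content (text boxes, for instance) more than once.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"


def paragraph_texts(docx_source) -> list[str]:
    """Return the text of each w:p element found in word/document.xml."""
    with zipfile.ZipFile(docx_source) as package:
        root = ET.fromstring(package.read("word/document.xml"))
    paragraphs = []
    for paragraph in root.iter(f"{{{W_NS}}}p"):
        # A paragraph's text is split across runs; join the w:t pieces.
        pieces = [t.text or "" for t in paragraph.iter(f"{{{W_NS}}}t")]
        paragraphs.append("".join(pieces))
    return paragraphs


# Stand-in document body with two paragraphs (placeholder text).
DOCUMENT_XML = (
    '<?xml version="1.0" encoding="UTF-8" standalone="yes"?>'
    f'<w:document xmlns:w="{W_NS}"><w:body>'
    "<w:p><w:r><w:t>First </w:t></w:r><w:r><w:t>paragraph</w:t></w:r></w:p>"
    "<w:p><w:r><w:t>Second paragraph</w:t></w:r></w:p>"
    "</w:body></w:document>"
)

sample = io.BytesIO()
with zipfile.ZipFile(sample, "w") as package:
    package.writestr("word/document.xml", DOCUMENT_XML)

print(paragraph_texts(sample))
```

With a real file, pass the path to the .docx in place of the in-memory sample.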
