When installing the Microsoft Office system, the first thing you'll probably notice, after the ribbon and task panes, is the file names. The old, familiar three-letter extensions suddenly have an "x" after them. But developers, long used to changing file formats, will find something much greater than a new binary format. With Open XML, you'll find the seed for an entirely new architecture, opening up not only the file format itself but also opportunities for a new breed of application.
With the Office Open XML file format standard, ratified by Ecma in December 2006, developers can build apps that read and write Office documents without instantiating Office itself. This hints at the potential for building both standalone products that integrate with Office and more deeply integrated enterprise applications based on the Office Business Applications (OBA) architecture.
What Is Open XML?
The "Ecma Office Open XML File Format," or just Open XML, is an open, international standard for describing productivity documents, such as spreadsheets, word processing documents, and presentations. Open XML files package XML documents into a zip container, which the Microsoft Office system saves with extensions like .docx, .xlsx, and .pptx for Office Word, Office Excel, and Office PowerPoint, respectively. This container includes the content itself, a description of the formatting, and metadata, all in XML format, as well as any binaries embedded in the document, such as graphics, video, or audio.
Microsoft has further extended the Open XML format to provide for custom functionality via macros and add-ins by introducing additional file formats, such as .docm, .xlsm, and .pptm. These formats offer an extra level of protection against malicious or inept code: users can be sure that a .docx file, say, doesn't have any macros, because Office won't even open it if it does contain them.
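To make the macro distinction concrete, here is a rough Python sketch that checks a package for a VBA project. It rests on an assumption not stated above: that macro code is stored in a part named vbaProject.bin (for example, word/vbaProject.bin). The function names has_vba_project and make_package are made up for illustration, and the stand-in packages contain empty parts rather than real documents.

```python
import io
import zipfile


def has_vba_project(source):
    """Return True if the package contains a part named vbaProject.bin.

    Assumption: macro-enabled files keep their VBA code in such a part.
    """
    with zipfile.ZipFile(source) as package:
        return any(
            name.lower().endswith("vbaproject.bin")
            for name in package.namelist()
        )


def make_package(part_names):
    """Build an in-memory zip with empty parts (a stand-in, not a real document)."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as package:
        for name in part_names:
            package.writestr(name, b"")
    buffer.seek(0)
    return buffer


if __name__ == "__main__":
    plain = make_package(["word/document.xml"])
    with_macros = make_package(["word/document.xml", "word/vbaProject.bin"])
    print(has_vba_project(plain))
    print(has_vba_project(with_macros))
```

A part-name check like this is only a heuristic; inspecting the content types declared in the package would likely be more thorough.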
Though Open XML came into play with the 2007 Microsoft Office system, this isn't the debut of XML in relation to Office. It actually started way back with Office 2000, when Microsoft introduced XML document properties. In Office XP, documents themselves began to taste the flavor of XML with the Office Excel spreadsheet XML format. That led to SpreadsheetML and WordprocessingML in Office Excel 2002 and Office Word 2003, respectively, setting the stage for full-on Open XML-based formats in the 2007 Microsoft Office system. Now the three flagship products of the Office suite (Office Word, Office Excel, and Office PowerPoint) all use the Open XML standard as their default file format.
To summarize, Open XML (also called the Office Open XML Format or Ecma Office Open XML) is an international open standard with 89 total schemas; a file format specification for electronic documents; and a zip file containing Office markup data in XML format, based on the Open Packaging Conventions (OPC), a new architecture that allows for things like embedded custom XML parts.
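As a rough illustration of writing a document without Office, the following Python sketch zips three XML parts into a one-paragraph word processing package. The function name build_docx and the constant names are mine, and the choice of exactly these three parts is an assumption about the minimum a package needs; check it against the specification before relying on it.

```python
import io
import zipfile
from xml.sax.saxutils import escape

# Declares the content type of each part in the package.
CONTENT_TYPES = """<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<Types xmlns="http://schemas.openxmlformats.org/package/2006/content-types">
  <Default Extension="rels" ContentType="application/vnd.openxmlformats-package.relationships+xml"/>
  <Default Extension="xml" ContentType="application/xml"/>
  <Override PartName="/word/document.xml" ContentType="application/vnd.openxmlformats-officedocument.wordprocessingml.document.main+xml"/>
</Types>"""

# Points from the package root to the main document part.
ROOT_RELS = """<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<Relationships xmlns="http://schemas.openxmlformats.org/package/2006/relationships">
  <Relationship Id="rId1" Type="http://schemas.openxmlformats.org/officeDocument/2006/relationships/officeDocument" Target="word/document.xml"/>
</Relationships>"""

# The document body: one paragraph holding one run of text.
DOCUMENT_TEMPLATE = """<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<w:document xmlns:w="http://schemas.openxmlformats.org/wordprocessingml/2006/main">
  <w:body>
    <w:p><w:r><w:t>{text}</w:t></w:r></w:p>
  </w:body>
</w:document>"""


def build_docx(text):
    """Return the bytes of a one-paragraph package; the text is XML-escaped."""
    document = DOCUMENT_TEMPLATE.format(text=escape(text))
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as package:
        package.writestr("[Content_Types].xml", CONTENT_TYPES)
        package.writestr("_rels/.rels", ROOT_RELS)
        package.writestr("word/document.xml", document)
    return buffer.getvalue()
```

Saving the returned bytes to a file with a .docx extension should give you something Office Word can open, though a real application would usually add styles, document properties, and other parts.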
What It Looks Like
Assuming you have not yet upgraded and just want to take a look at a sample document, check out Figure 1. This was created using the Office 2007 Trial. As you can see, there's not much to this sample document. It uses Office Word 2007's default style set, with some basic formatting.
Figure 1. Sample Office Word 2007 Document
Here's how the Open XML standard works: describe the document using multiple XML files, then zip them all together. That's pretty much it. In a way, Microsoft Office applications have gone from being document creators to WYSIWYG XML editors for a specific kind of XML file (though they also recognize legacy Office and other file formats). To prove this, rename the document in File Explorer and see for yourself.
The document in Figure 1 began life as "SampleOpenXMLDocument.docx". By changing the filename extension from .docx to .zip, you can open it like a compressed folder, revealing the contents shown in Figure 2 (here I've also opened the "word" folder within the zip file).
Figure 2. Contents of docx as .zip File
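You can do the same inspection in code instead of renaming the file. Here is a minimal Python sketch using the standard zipfile module. The names list_parts and make_stand_in_package are made up for illustration, and the stand-in package only mimics the folder layout of a .docx; it is not a complete document.

```python
import io
import zipfile


def list_parts(source):
    """Return the names of the parts in a package (file path or file-like object)."""
    with zipfile.ZipFile(source) as package:
        return package.namelist()


def make_stand_in_package():
    """Build a tiny in-memory zip that mimics a .docx layout (illustrative only)."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as package:
        package.writestr("[Content_Types].xml", "<Types/>")
        package.writestr("_rels/.rels", "<Relationships/>")
        package.writestr("word/document.xml", "<document/>")
    buffer.seek(0)
    return buffer


if __name__ == "__main__":
    for name in list_parts(make_stand_in_package()):
        print(name)
```

Passing the path of a real file, such as SampleOpenXMLDocument.docx, to list_parts should work without renaming it, since zipfile looks at the file's contents rather than its extension.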
To go even further, take a look at the contents of some of these files. For example, Figure 3 shows the document.xml file:
Figure 3. SampleOpenXMLDocument.zip\word\document.xml
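Because document.xml is ordinary XML, any XML parser can read it. The sketch below pulls the text of each paragraph out of the package with Python's standard library. It assumes the usual WordprocessingML layout, in which text lives in w:t elements inside w:p paragraph elements. The names paragraph_texts and make_stand_in_package are hypothetical, and the sample markup is far simpler than what Office Word actually writes.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

# Namespace of the main WordprocessingML vocabulary.
W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"


def paragraph_texts(source):
    """Return the plain text of each paragraph found in word/document.xml."""
    with zipfile.ZipFile(source) as package:
        root = ET.fromstring(package.read("word/document.xml"))
    texts = []
    for paragraph in root.iter("{%s}p" % W_NS):
        # A paragraph's text may be split across several runs; join them.
        runs = paragraph.iter("{%s}t" % W_NS)
        texts.append("".join(run.text or "" for run in runs))
    return texts


def make_stand_in_package():
    """Build an in-memory package holding a simplified document.xml (illustrative only)."""
    document = (
        '<w:document xmlns:w="%s"><w:body>'
        "<w:p><w:r><w:t>Hello, </w:t></w:r><w:r><w:t>Open XML</w:t></w:r></w:p>"
        "<w:p><w:r><w:t>Second paragraph</w:t></w:r></w:p>"
        "</w:body></w:document>" % W_NS
    )
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as package:
        package.writestr("word/document.xml", document)
    buffer.seek(0)
    return buffer


if __name__ == "__main__":
    for text in paragraph_texts(make_stand_in_package()):
        print(text)
```

This ignores formatting, tables, headers, and everything else in the package; it is only meant to show that the content is reachable with general-purpose tools.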