

Generating Microsoft Office Documents with the Open XML SDK : Page 4

The Open XML SDK provides a comprehensive set of classes that make generating and manipulating Microsoft Office documents much simpler and faster than was possible with older Office file formats.


Template-Driven Document Generation using Word Content Controls

The example in the last section read the entire contents of a short document template into memory, and then performed a search and replace operation. That's fine for small templates, but when you have large multi-page templates, that approach will create memory and performance issues. Instead, you can use Content Controls, which help create templates, support structured editing, and also provide placeholders for various kinds of content in documents.

The primary content controls available are:

  • Plain Text
  • Rich Text
  • Picture
  • Date Picker
  • Combo Box
  • Drop-Down List

Apart from the intrinsic benefits that structured documents and content-type restrictions offer, you also benefit from the way OOXML stores data rendered in content controls.

OOXML stores content control data in a custom XML file in the document package. Individual controls are mapped to elements in the custom XML file. When you open such a document, it late-binds to the content control data in the file. While the document is open, any changes you make to content in the controls are reflected in the XML data, and vice versa.

The fact that content control data is stored separately and mapped to controls at runtime makes it a good candidate for generating template-based documents.

This example covers three main topics:

  • Creating a template based on content controls
  • Using the Word 2007 Content Control Toolkit to map controls to custom XML elements
  • Updating the custom XML data programmatically, and generating documents based on the template

The next sections explain each topic in more detail.

Creating a Template

Open Word, create a new document, and switch to the Developer tab on the ribbon.

Author's Note: If the Developer tab is not visible (it's not by default), you can enable it by opening Word Options. To do that, click the Office button at the top left of your Word window and click the Word Options button at the bottom. In the "Popular" group, click the "Show Developer tab in the ribbon" option. Close the Word Options dialog, and the Developer tab will appear.

The Developer ribbon has a button group called "Controls" that lets you insert various kinds of controls into the document. This example uses the same template as the "Search and Replace" section earlier in this article. This time, however, you'll create it using content controls. Again, the template example contains the text "The current version of [sdk] is [VersionNumber]."

Add two plain text controls for the SDK name and version number. After adding the controls, the template will look similar to Figure 3, depending on the control names you provided.

Figure 3. Template with Content Controls: In Word, the sample document containing the content controls should look similar to this.

Word 2007 Content Control Toolkit

The Word 2007 Content Control Toolkit provides a visual interface that helps when mapping custom XML elements to content controls—a process much easier than writing XPath queries. Download the Word 2007 Content Control Toolkit from CodePlex and install it, then start the tool and open the document template you created in the preceding section.

In the "Content Controls" pane on the left, you will see the details of the two controls in the template, including their names and types. In the "Custom XML parts" pane on the right, click on the link "Click here to create a new one" to create a new custom XML file. Switch to Edit View and add two elements that will store data for the two controls. The element names do not have to match the control names. For example, my XML file contains a root element with two child elements, SdkName and Version.


Switch to "Bind View" and drag the elements you created to the left pane and drop them on the controls. The drag/drop process establishes the bindings. After you've established the bindings, the left pane will look similar to Figure 4.

Figure 4. Bound Content Controls: Here's how the Content Controls pane in the Word 2007 Content Control Toolkit looks after binding the SDKName and VersionNumber controls to specific XML elements.

Save the template, and then inspect the document package by changing the extension from docx to zip. You will find a new customXml folder containing your custom XML file with the data. If you change the contents of this XML file and then reopen the document (remembering to change the zip extension back to docx), you will find that the content controls now display the updated content. Similarly, if you change the control content in Word, save the file, rename it, and re-inspect the custom XML file, you'll see that the changes have been persisted there.
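
The same inspection can be done in code without renaming the file, because a .docx package is an ordinary ZIP archive. The following is a minimal sketch, assuming .NET 4.5 or later for System.IO.Compression. The PackageInspector class, the ListCustomXmlEntries method, and the template.docx path are hypothetical names chosen for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.IO.Compression;
using System.Linq;

public static class PackageInspector
{
    // Returns the names of all entries stored under the customXml folder.
    public static List<string> ListCustomXmlEntries(Stream package)
    {
        // leaveOpen: true so the caller keeps ownership of the stream.
        using (var zip = new ZipArchive(package, ZipArchiveMode.Read, true))
        {
            return zip.Entries
                .Select(e => e.FullName)
                .Where(n => n.StartsWith("customXml/", StringComparison.OrdinalIgnoreCase))
                .ToList();
        }
    }

    public static void Main(string[] args)
    {
        // Hypothetical path; pass the template you created as the first argument.
        string path = args.Length > 0 ? args[0] : "template.docx";
        using (FileStream fs = File.OpenRead(path))
        {
            foreach (string name in ListCustomXmlEntries(fs))
                Console.WriteLine(name);
        }
    }
}
```

Run against the bound template, this should print the custom XML item created by the toolkit, plus any related parts stored in the same folder.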

With the template and bindings in place, you now have the opportunity to generate a large number of documents based on the template.

Update Custom XML Data Programmatically

Open the package using the WordprocessingDocument class's Open method:

WordprocessingDocument wordDoc = 
   WordprocessingDocument.Open(fileName, true);

Next, create the custom XML file containing the data for this document, and store it in the package.

Store the custom XML in memory and add placeholders for the actual data. For this sample the custom XML with placeholders is a string containing:

<root> <SdkName>!Name!</SdkName> 
   <Version>!Version!</Version> </root>

You'll replace the !Name! and !Version! placeholders with actual data for each document. This example uses the .NET Regex class, but you can use any code or technology you like to create your custom XML. When your custom XML is ready, replace the existing custom XML part by deleting it and adding a new part that contains the new XML, using the following code:

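MainDocumentPart mainPart = wordDoc.MainDocumentPart;
// Delete the existing custom XML parts, then add a new one
mainPart.DeleteParts<CustomXmlPart>(mainPart.CustomXmlParts);
CustomXmlPart customXmlPart = mainPart.AddNewPart<CustomXmlPart>();
using (StreamWriter ts = new StreamWriter(customXmlPart.GetStream()))
   ts.Write(customXml); // customXml: the XML string prepared above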
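
The XML string written into the new part comes from the placeholder substitution described above. Here is a minimal sketch of that step, using the placeholder template from this section. The CustomXmlBuilder class and the sample values are hypothetical, and escaping the data with SecurityElement.Escape is an added precaution rather than something the original sample is known to do.

```csharp
using System;
using System.Security;
using System.Text.RegularExpressions;

public static class CustomXmlBuilder
{
    // Custom XML with placeholders, as shown in the article.
    public const string Template =
        "<root> <SdkName>!Name!</SdkName> <Version>!Version!</Version> </root>";

    public static string Build(string sdkName, string version)
    {
        // Escape the data so characters such as & or < keep the XML well-formed.
        string safeName = SecurityElement.Escape(sdkName);
        string safeVersion = SecurityElement.Escape(version);

        // One pass over the template. The MatchEvaluator inserts values literally,
        // so "$" sequences in the data are not treated as group references.
        return Regex.Replace(
            Template,
            "!(Name|Version)!",
            m => m.Groups[1].Value == "Name" ? safeName : safeVersion);
    }

    public static void Main()
    {
        // Sample values for illustration only.
        Console.WriteLine(Build("Open XML SDK", "1.0"));
    }
}
```

Doing the substitution in a single pass also avoids replacing a placeholder-like token that happens to appear inside the data itself.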

You can now save this document under a different name, and repeat the process as needed, using different data for each document, giving you a fast way to create large numbers of custom documents.
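
Put together, the copy, open, and replace steps might look like the sketch below. It assumes Open XML SDK 2.0 or later, where MainDocumentPart.AddCustomXmlPart(CustomXmlPartType.CustomXml) is used in place of the AddNewPart<CustomXmlPart>() call shown above. The DocumentGenerator class, the Generate method, and its parameters are hypothetical names.

```csharp
using System.Collections.Generic;
using System.IO;
using System.Linq;
using DocumentFormat.OpenXml.Packaging;

public static class DocumentGenerator
{
    // Copies the template to outputPath and swaps its custom XML for customXml.
    public static void Generate(string templatePath, string outputPath, string customXml)
    {
        // Work on a copy so the template itself is never modified.
        File.Copy(templatePath, outputPath, true);

        using (WordprocessingDocument wordDoc = WordprocessingDocument.Open(outputPath, true))
        {
            MainDocumentPart mainPart = wordDoc.MainDocumentPart;

            // Snapshot the existing parts before deleting them.
            List<CustomXmlPart> oldParts = mainPart.CustomXmlParts.ToList();
            mainPart.DeleteParts(oldParts);

            // Assumption: SDK 2.0+ API for adding a custom XML part.
            CustomXmlPart newPart = mainPart.AddCustomXmlPart(CustomXmlPartType.CustomXml);
            using (var writer = new StreamWriter(
                newPart.GetStream(FileMode.Create, FileAccess.Write)))
            {
                writer.Write(customXml);
            }
        }
    }
}
```

Calling Generate once per data record, with a different output path and XML string each time, produces one document per record without touching the body of the template.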

The main difference between this and the earlier Search and Replace approach is that this technique focuses only on the dynamic data, while the other approach required parsing the entire document.

This approach can also be more efficient than Word's Mail Merge feature, which is typically used to create large numbers of small documents from a template and a data store.

This article used only a small fraction of the Open XML SDK, but if you've ever tried to create or manipulate Word files programmatically using earlier technologies, you can probably already tell that this approach is simpler and more robust. Beyond the scenarios shown here, the Open XML SDK also lets you operate on comments and tracked changes stored in Word documents, and it contains APIs that operate on Microsoft Excel and PowerPoint documents.

Vikas Goyal is a Microsoft MVP solutions architect with several years of industry experience. He is mainly involved in designing products and solutions for the financial industry. You can contact him through his web site or his blog.