Source code has three audiences: the compiler, ourselves, and other readers. For more and more developers, the last is the most important. Of course it's important to properly specify to the compiler what it is you wish the computer to do, but in a global marketplace, a talent for programming is not enough. To stand out, you must be able to communicate to other people the reason for your work, its unique characteristics, the alternatives that you considered and rejected, and so forth. "Literate programming" is a term that captures this mandate to create software programs that are not just specifications, but essays on the task at hand.
The simple text files that are consumed by compilers are hardly satisfactory vehicles for exposition when compared to the power of, for instance, Microsoft Word. Images, fonts, call-outs, complex page layouts: all of these things should be available to the software essayist.
Unfortunately, as anyone who's written for publication on software development knows, validating and maintaining the integrity of source code embedded in an Office document can be decidedly non-trivial. Now that authors are routinely charged with producing camera-ready pages and are responsible for fine-tuning line and page breaks while simultaneously trying to track APIs evolving in public, validating a book-length text can take hours.
To be really functional, a literate programming tool has to allow source code to be interleaved with the word-processing document and extracted and compiled at whim. Traditionally, literate programming tools have worked by embedding special tags or escape codes within text-based document formatting languages (in particular, TeX). Those of us who prefer Word have been forced to cobble together various hacks using Word styles or hidden text.
The ease with which SmartTags, ActionPanes, and XML are programmed using Visual Studio Tools for Office 2005 allowed me to write a literate programming tool for .NET that is easy to use, supports both C# and VB.NET, and uses the full formatting power of Word.

Figure 1. A literate .NET Word document consists of code fragments and formatted text.
As Figure 1 shows, a "Literate .NET" Word document consists of a mix of normal word processing text and code fragments. Individual code fragments are themselves mixtures of source code and references to other fragments. Within a fragment, references are distinguished from literal code by being set in chevrons («like this»), while a fragment that defines a reference is preceded by either "=" or "+=", depending on whether the fragment is a "definition" or an "amendment" to a previous definition. A SmartTag recognizes the triple-line "equivalent to" character (≡) to provide easy access to the "Check compile," "Save source to clipboard," and "Save XML to clipboard" functions.
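The "=" and "+=" conventions can be illustrated with a minimal sketch in Python. It is not the tool's own code (the tool targets .NET); it assumes that an amendment simply appends lines to the earlier definition and that a chevron reference occupies a line by itself. The function names add_fragment and expand, and the sample fragment names, are hypothetical.

```python
import re

# A reference is assumed to be a fragment name set in chevrons on its own line.
REFERENCE = re.compile(r"«([^»]+)»")

def add_fragment(fragments, name, operator, lines):
    """Record a fragment: '=' defines it, '+=' amends an earlier definition."""
    if operator == "=":
        fragments[name] = list(lines)
    elif operator == "+=":
        if name not in fragments:
            raise KeyError(f"cannot amend undefined fragment: {name}")
        # Assumption: an amendment appends to the end of the definition.
        fragments[name].extend(lines)
    else:
        raise ValueError(f"unknown operator: {operator}")

def expand(fragments, name, active=()):
    """Return the fragment's lines with each reference replaced recursively."""
    if name in active:
        raise ValueError(f"circular reference: {name}")
    result = []
    for line in fragments[name]:
        match = REFERENCE.fullmatch(line.strip())
        if match:
            # Carry the reference's indentation over to the expanded lines.
            indent = line[: len(line) - len(line.lstrip())]
            for sub in expand(fragments, match.group(1), active + (name,)):
                result.append(indent + sub)
        else:
            result.append(line)
    return result

fragments = {}
add_fragment(fragments, "Main class", "=", [
    "class Hello {",
    "    «Main method»",
    "}",
])
add_fragment(fragments, "Main method", "=", [
    "static void Main() {",
    "    «Greetings»",
    "}",
])
add_fragment(fragments, "Greetings", "=",
             ['System.Console.WriteLine("Hello");'])
add_fragment(fragments, "Greetings", "+=",
             ['System.Console.WriteLine("World");'])

source = "\n".join(expand(fragments, "Main class"))
print(source)
```

Expanding "Main class" yields a single C# class whose Main method contains both WriteLine calls, the second one contributed by the "+=" amendment.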
"What XML?" you ask. Figure 2 shows the considerably more complex view of a source code snippet when "Show XML tags in document" is turned on. Even this is simplified, as each CodeBlock tag contains a "type" attribute specifying whether it is a "Literal" or "Reference" tag. At a higher level, Fragments are gathered into CompilationUnits (corresponding to source code files), which, in turn, contribute to an AssemblyTarget. In addition to its CompilationUnits, an AssemblyTarget element has subtags that specify references, whether the target assembly is executable, and the programming language (these subtags are styled to use hidden text, so they don't appear on the printed page). The assembly of source code will work with any language, but compilation via the SmartTag is only available for C# and VB.NET.
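To make the nesting concrete, here is a small Python sketch that parses a made-up document and summarizes it. The element names AssemblyTarget, CompilationUnit, Fragment, and CodeBlock, and the "type" attribute, come from the description above; the "name" attributes and the Language, Executable, and AssemblyReference subtags are assumptions invented for illustration and may not match the real schema.

```python
from xml.dom import minidom

# Hypothetical markup. Only the four element names and the "type" attribute
# are taken from the article; everything else is an illustrative guess.
SAMPLE = """\
<AssemblyTarget>
  <Language>C#</Language>
  <Executable>true</Executable>
  <AssemblyReference>System.dll</AssemblyReference>
  <CompilationUnit name="Hello.cs">
    <Fragment name="Main class">
      <CodeBlock type="Literal">class Hello {</CodeBlock>
      <CodeBlock type="Reference">Main method</CodeBlock>
      <CodeBlock type="Literal">}</CodeBlock>
    </Fragment>
    <Fragment name="Main method">
      <CodeBlock type="Literal">static void Main() { }</CodeBlock>
    </Fragment>
  </CompilationUnit>
</AssemblyTarget>
"""

def text_of(element):
    """Concatenate the text nodes directly under an element."""
    return "".join(
        node.data for node in element.childNodes
        if node.nodeType == node.TEXT_NODE
    )

def summarize(xml_text):
    """Map each compilation unit to its fragments and their typed blocks."""
    target = minidom.parseString(xml_text).documentElement
    summary = {
        "language": text_of(target.getElementsByTagName("Language")[0]),
        "units": {},
    }
    for unit in target.getElementsByTagName("CompilationUnit"):
        fragments = {}
        for fragment in unit.getElementsByTagName("Fragment"):
            fragments[fragment.getAttribute("name")] = [
                (block.getAttribute("type"), text_of(block))
                for block in fragment.getElementsByTagName("CodeBlock")
            ]
        summary["units"][unit.getAttribute("name")] = fragments
    return summary

summary = summarize(SAMPLE)
for unit_name, unit_fragments in summary["units"].items():
    print(unit_name, "->", list(unit_fragments))
```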

Figure 2. A complex document view which shows the XML tags.
When the SmartTag is chosen, some very simple code confirms that the click occurred within an AssignmentType XML element. If so, it navigates up to the containing AssemblyTarget element, retrieves all of the XML of that element (that is, all the source code for the assembly), and then processes it. This is shown in Listing 1.
The «Navigate to AssemblyTarget» step is trivial, recursively following the ParentNode property of the currentNode. Once the XML is produced, a simple parser walks through the XML tree, assembling source code. When it gets to a CodeBlock element that has the Type attribute value of "Reference", it recursively walks the tree, looking for a Fragment element that defines the relevant identifier.
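Both steps can be sketched in a few lines of Python using the standard library's minidom, whose parentNode plays the role of .NET's ParentNode. This is an illustration under assumptions rather than the tool's actual code: the "name" attribute on Fragment and CompilationUnit is hypothetical, the clicked node is simply taken to be a CodeBlock, and the helpers find_ancestor, find_fragment, and assemble are invented names.

```python
from xml.dom import minidom

# Hypothetical sample; the "name" attributes are assumptions.
SAMPLE = """<AssemblyTarget>
  <CompilationUnit name="Hello.cs">
    <Fragment name="Main class">
      <CodeBlock type="Literal">class Hello {</CodeBlock>
      <CodeBlock type="Reference">Main method</CodeBlock>
      <CodeBlock type="Literal">}</CodeBlock>
    </Fragment>
    <Fragment name="Main method">
      <CodeBlock type="Literal">static void Main() { }</CodeBlock>
    </Fragment>
  </CompilationUnit>
</AssemblyTarget>"""

def find_ancestor(node, tag_name):
    """Climb parentNode links until an element with the given tag is found."""
    if node is None:
        return None
    if node.nodeType == node.ELEMENT_NODE and node.tagName == tag_name:
        return node
    return find_ancestor(node.parentNode, tag_name)

def block_text(block):
    """Concatenate the text nodes directly under a CodeBlock."""
    return "".join(
        child.data for child in block.childNodes
        if child.nodeType == child.TEXT_NODE
    )

def find_fragment(target, name):
    """Search the whole target for the Fragment that defines the name."""
    for fragment in target.getElementsByTagName("Fragment"):
        if fragment.getAttribute("name") == name:
            return fragment
    raise KeyError(f"no fragment defines: {name}")

def assemble(target, fragment, active=()):
    """Emit literal blocks as-is and expand reference blocks recursively."""
    name = fragment.getAttribute("name")
    if name in active:
        raise ValueError(f"circular reference: {name}")
    lines = []
    for block in fragment.getElementsByTagName("CodeBlock"):
        if block.getAttribute("type") == "Reference":
            referenced = find_fragment(target, block_text(block))
            lines.extend(assemble(target, referenced, active + (name,)))
        else:
            lines.append(block_text(block))
    return lines

document = minidom.parseString(SAMPLE)
clicked = document.getElementsByTagName("CodeBlock")[1]  # stand-in for the click
target = find_ancestor(clicked, "AssemblyTarget")
source = "\n".join(assemble(target, find_fragment(target, "Main class")))
print(source)
```

The active tuple guards against a fragment that refers to itself, directly or indirectly; whether the real tool performs such a check is not stated here.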
The most startling aspect of this project was how easy Visual Studio Tools for Office 2005 has made SmartTags, as shown in Listing 2.
This source code runs within the ThisDocument_Startup() method of the ThisDocument class generated by the VSTO Wizard. A few years ago, I beat my head against the wall trying to create a "Compile it!" SmartTag, and my initial experiments with literate programming in VSTO 2005 were based on the use of "Save as XML…" in Word. I figured "What the heck," though, and decided to try building the SmartTag with VSTO 2005. I had SmartTag functionality almost before I realized it!
The hardest aspect of the project was developing the XML schema to describe a program. I began by creating an object model, serializing it to XML, and then using Visual Studio 2005's tools to infer a schema. I then refined the schema by using it within Word to tag sample documents. Whenever I ran into a problem, I refined the schema by hand, deleted the old schema from Word's schema library, imported the new one, and tried again.
Once a Word document is tagged up with the Literate .NET schema, things work well. However, inserting the tags still takes too long. My future plans for the tool center on integrating macros and style formatting so that creating fully tagged fragments will be as seamless as possible. I also hope to incorporate "smart" pasting of source code from Visual Studio (auto-outdent, better line-wrapping, etc.). I've also thought of adding parameters to reference code-blocks, but that train of thought borders on programming language design, and I want to keep the tool as general as possible.
One way or the other, the power of Visual Studio Tools for Office and .NET 2.0 will provide the tools. A literate "Hello, World" Word document is included in the downloadable source for this article, as is the source code for the current version of the tool. You'll need the .NET 2.0 beta to try out the application. Check my Web site for further notes on the evolution of the tool.