CoDe Magazine
Working with Microsoft Office Word 2003's XML
One of Microsoft Office 2003's most significant new features is the integration of XML technology. This article focuses on taking advantage of Word 2003's XML features from within your applications. 

The .doc file format that is still present in Word 2003 is essentially a proprietary binary format, and extracting information from .doc files is difficult. By saving documents in the new XML format, you can easily retrieve information trapped inside Word 2003 documents by using little more than XPath queries.
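As a rough illustration of the XPath idea, the following Python sketch pulls paragraph text out of a document saved in Word 2003's XML format. It uses only the limited XPath subset in Python's standard `xml.etree.ElementTree` module. The `SAMPLE` document and the `paragraph_texts` helper are invented for this example; a real file saved from Word contains far more markup, but the `w:p`, `w:r`, and `w:t` elements are nested the same way.

```python
import xml.etree.ElementTree as ET

# Namespace used by Word 2003's XML format.
W_NS = "http://schemas.microsoft.com/office/word/2003/wordml"
NS = {"w": W_NS}

# Hand-written stand-in for a document saved as XML (illustrative data only).
SAMPLE = f"""<w:wordDocument xmlns:w="{W_NS}">
  <w:body>
    <w:p><w:r><w:t>Dear Maria Anders,</w:t></w:r></w:p>
    <w:p><w:r><w:t>Thank you </w:t></w:r><w:r><w:t>for your order.</w:t></w:r></w:p>
  </w:body>
</w:wordDocument>"""


def paragraph_texts(xml_text):
    """Return the text of each paragraph, joining the runs inside it."""
    root = ET.fromstring(xml_text)
    paragraphs = []
    for p in root.findall(".//w:p", NS):      # every paragraph, at any depth
        texts = p.findall("./w:r/w:t", NS)    # text nodes of its direct runs
        paragraphs.append("".join(t.text or "" for t in texts))
    return paragraphs


print(paragraph_texts(SAMPLE))
# prints ['Dear Maria Anders,', 'Thank you for your order.']
```

A paragraph's text is often split across several runs, which is why the sketch joins the `w:t` values rather than reading a single node.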

New features in Word 2003 also let you have users enter data into an XML document without their knowledge! Essentially, you can annotate a document with an XML schema and then protect the document, allowing the user to add or edit information only in specific locations throughout the document. This way, when the user saves the document, the data is written directly to an XML document, where another application or a database can easily consume it.

Another cool idea for using XML with Word 2003 documents is the ability to transform XML into other formats. As of this writing, there is an XSLT provided by Microsoft that takes a Word 2003 XML document and transforms it into an HTML document for viewing in a Web browser. Of course, my first reaction to this was "What good is that? I can save a document as HTML, right?" Then I realized that I have complete control over this transformation by designing my own XSLT, unlike the "Save as HTML" functionality from previous versions.

But these ideas are outside the topic of this article, which is focused on the ability to manipulate a Word 2003 document (saved as XML) from within code. Before Word 2003, all you could effectively do was to either use automation or to be really handy with the RTF format (and open the RTF using Word). With the ability of Word 2003 to both save as and read from XML, you can create sophisticated Word 2003 documents by processing and manipulating XML.

If you're not sure why you might try something like this, here are a few ideas:

  • You can create documents from data within an application, such as form letters.
  • You can send Word 2003 documents to a client workstation over the Internet as XML and have it correctly interpreted at the client workstation as a Word 2003 document.
  • You can return Word 2003 documents from Web services.
So, to get a better feel for how this may benefit your own applications, let's walk through the creation of a Word 2003 template, save it as XML, and then manipulate the document (using data provided by a user) to produce a final document for use in the application.
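To make the first two ideas in the list above a little more concrete, here is a minimal Python sketch that builds a tiny form letter as Word 2003 XML for each record in a list. The `build_letter` helper and the customer records are invented for illustration and are not part of the article's walk-through. The sketch assumes the `mso-application` processing instruction that Word 2003 writes at the top of the XML files it saves, which is what lets a client workstation associate the file with Word.

```python
import xml.etree.ElementTree as ET

W_NS = "http://schemas.microsoft.com/office/word/2003/wordml"
ET.register_namespace("w", W_NS)  # serialize with the conventional "w" prefix


def build_letter(lines):
    """Build a minimal Word 2003 XML document with one paragraph per line."""
    doc = ET.Element(f"{{{W_NS}}}wordDocument")
    body = ET.SubElement(doc, f"{{{W_NS}}}body")
    for line in lines:
        p = ET.SubElement(body, f"{{{W_NS}}}p")   # paragraph
        r = ET.SubElement(p, f"{{{W_NS}}}r")      # run
        t = ET.SubElement(r, f"{{{W_NS}}}t")      # text
        t.text = line
    xml_body = ET.tostring(doc, encoding="unicode")
    # The processing instruction marks the file as a Word document.
    # Write the result to disk or to a response stream as UTF-8.
    return ('<?xml version="1.0" encoding="UTF-8" standalone="yes"?>\n'
            '<?mso-application progid="Word.Document"?>\n' + xml_body)


# Illustrative records; in practice these would come from the application.
customers = [{"ContactName": "Maria Anders"}, {"ContactName": "Ana Trujillo"}]
letters = [
    build_letter([f"Dear {c['ContactName']},", "Thank you for your business."])
    for c in customers
]
print(letters[0])
```

Building the whole document in code only suits very simple letters. For anything with real formatting, starting from a template authored in Word, as the rest of this article does, is the more practical route.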

Creating a Schema
The first step in this process is to create a schema for the data that you can insert into the Word 2003 document template. Although you don't actually need to have a schema, it's a bit easier to work with the document if you apply a schema to it. Without the schema, you'd have to use a feature like bookmarks, which are rendered like the following XML snippet:
   <aml:annotation aml:id="0" 
       w:type="Word.Bookmark.Start" 
       w:name="ContactName"/>
   <w:p>
     <w:r>
       <w:t>[ContactName]</w:t>
     </w:r>
   <aml:annotation aml:id="0" 
       w:type="Word.Bookmark.End"/>
   </w:p>
Notice how the bookmark, named ContactName in this example, is delimited by two empty annotation elements. The only things that distinguish these elements are the type attribute values of Word.Bookmark.Start and Word.Bookmark.End. This is slightly more complex than applying a schema to the document, which produces the XML in the following snippet:
   <ns0:ContactName>
     <w:p>
       <w:r>
         <w:t>[ContactName]</w:t>
       </w:r>
     </w:p>
   </ns0:ContactName>
Because I'm starting from scratch, the schema approach seems to be a slightly easier way to go. But I can imagine a situation where you are migrating from an earlier version of Word and your documents are already marked up with bookmarks. As you can see, it's still possible to use the bookmarks; it's just a tiny bit more work than using an attached schema.
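The difference in effort shows up when you locate the placeholder from code. The Python sketch below fills in the ContactName placeholder both ways, using the two snippets above wrapped in a minimal document. The `urn:example:customer` namespace bound to `ns0` is a stand-in, because the real value depends on the target namespace of your schema, and the helper names `fill_schema_element` and `fill_bookmark` are invented for this example.

```python
import xml.etree.ElementTree as ET

W_NS = "http://schemas.microsoft.com/office/word/2003/wordml"
AML_NS = "http://schemas.microsoft.com/aml/2001/core"
CUST_NS = "urn:example:customer"  # hypothetical; use your schema's namespace
NS = {"w": W_NS, "aml": AML_NS, "ns0": CUST_NS}

SCHEMA_DOC = f"""<w:wordDocument xmlns:w="{W_NS}" xmlns:ns0="{CUST_NS}">
  <w:body>
    <ns0:ContactName>
      <w:p><w:r><w:t>[ContactName]</w:t></w:r></w:p>
    </ns0:ContactName>
  </w:body>
</w:wordDocument>"""

BOOKMARK_DOC = f"""<w:wordDocument xmlns:w="{W_NS}" xmlns:aml="{AML_NS}">
  <w:body>
    <aml:annotation aml:id="0" w:type="Word.Bookmark.Start" w:name="ContactName"/>
    <w:p>
      <w:r><w:t>[ContactName]</w:t></w:r>
      <aml:annotation aml:id="0" w:type="Word.Bookmark.End"/>
    </w:p>
  </w:body>
</w:wordDocument>"""


def fill_schema_element(root, name, value):
    """Schema approach: the element name itself identifies the field."""
    field = root.find(f".//ns0:{name}", NS)
    text_node = field.find(".//w:t", NS) if field is not None else None
    if text_node is None:
        return False
    text_node.text = value
    return True


def fill_bookmark(root, name, value):
    """Bookmark approach: scan in document order between the two markers."""
    open_id = None  # id of the bookmark once its start marker has been seen
    for el in root.iter():
        if el.tag == f"{{{AML_NS}}}annotation":
            kind = el.get(f"{{{W_NS}}}type")
            marker_id = el.get(f"{{{AML_NS}}}id")
            if kind == "Word.Bookmark.Start" and el.get(f"{{{W_NS}}}name") == name:
                open_id = marker_id
            elif (kind == "Word.Bookmark.End" and open_id is not None
                  and marker_id == open_id):
                return False  # bookmark closed without containing any text
        elif open_id is not None and el.tag == f"{{{W_NS}}}t":
            el.text = value  # first text node inside the bookmark
            return True
    return False


for source, fill in ((SCHEMA_DOC, fill_schema_element), (BOOKMARK_DOC, fill_bookmark)):
    root = ET.fromstring(source)
    fill(root, "ContactName", "Maria Anders")
    print([t.text for t in root.iter(f"{{{W_NS}}}t")])
```

The schema version is a direct lookup by element name, while the bookmark version has to track the start and end markers by their shared id. This sketch replaces only the first text node it finds, so a placeholder that spans several runs would need extra handling.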

For example, using the Northwind Customers table from SQL Server, I've created a very simple schema that is listed in its entirety in Listing 1.

This simple schema points out another advantage to using a schema-based approach: Word 2003 enforces the restrictions defined in the schema for the document. Any violations appear as errors in Word 2003's task pane feature, but you can also validate the document against the schema with any XML validation tool.

The schema that you create can be as simple or as complex as you like. What is important is how to mark up the Word 2003 document with this schema so that you get the desired XML output from your application.

© Copyright Component Developer Magazine and EPS Software Corp., 2006