Working with Microsoft Office Word 2003's XML : Page 3

One of Microsoft Office 2003's most significant new features is the integration of XML technology. This article focuses on taking advantage of Word 2003's XML features from within your applications.


Creating the Output
Now that the template has been defined and annotated as desired, you can write a small program to read data from an XML file and merge this data with the template. For this example, I used a console application (because I don't need a GUI) and chose Visual Basic .NET as the language.

First, look at the XML data in Listing 2 that I'll merge with the document. It contains a single record from the Northwind database on SQL Server.

To make the example easier, I've saved this as a file called NWData.xml. In the real world, I'd probably capture the desired data in a Web page or Windows application and then retrieve the data from a database instead of a disk file.

This XML file contains more elements than I've applied to the Word 2003 document. That means I'll have to be certain to skip the extra elements when processing the file; perhaps they'll be added to other document templates in the future.

The code (the complete listing is shown in Listing 3) uses the XmlDocument class from the .NET Framework to do the bulk of the work. The code starts by loading both the data file and the Word 2003 template file into separate XML DOM objects. The Word 2003 document (saved as XML) is loaded through the LoadFile method of the WordXMLTest class, which is instantiated as the oProcess object.

    Dim oProcess As New WordXMLTest
    Dim sDocPath As String
    Dim sDataPath As String
    Dim sSaveFile As String

    sDocPath = "sample2.xml"
    sDataPath = "NWdata.xml"
    sSaveFile = "OutFile.xml"

    Try
        'load the WordXML into a DOM
        oProcess.LoadFile(sDocPath)

        'load data into DOM
        xmlDataDoc.Load(sDataPath)

Next, select the nodes from the data document with a simple XPath query, and iterate through them with a For-Next loop. Note that this code assumes that only a single customer record exists in the XML file. If there are multiple customers, add an outer loop to iterate through each customer record.

        'iterate through data nodes
        xmlNodes = xmlDataDoc.SelectNodes( _
            "/results/customers/*")

        'replace Word doc area with data
        If Not xmlNodes Is Nothing Then
            For i = 0 To xmlNodes.Count - 1
                xmlNode = xmlNodes(i)
                sNodeName = xmlNode.Name
                sNewText = xmlNode.InnerText
                oProcess.ProcessNodes( _
                    sNodeName, sNewText)
            Next
        End If
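The outer loop suggested above for multiple customers can be sketched as follows. This is only an illustration, written in Python with the standard library's xml.etree.ElementTree rather than the article's VB.NET XmlDocument code. The sample data, its field values, and the customer_records helper are hypothetical; only the results/customers layout is taken from the XPath query in the article.

```python
import xml.etree.ElementTree as ET

# Hypothetical stand-in for the data file; the field values are made up
# and the real file contains more elements than are shown here.
DATA_XML = """<results>
  <customers>
    <CustomerID>CUST1</CustomerID>
    <ContactName>First Contact</ContactName>
  </customers>
  <customers>
    <CustomerID>CUST2</CustomerID>
    <ContactName>Second Contact</ContactName>
  </customers>
</results>"""


def customer_records(xml_text):
    """Return one {element name: text} dict per <customers> record."""
    root = ET.fromstring(xml_text)                 # root is <results>
    records = []
    for customer in root.findall("customers"):     # outer loop: each record
        fields = {}
        for child in customer:                     # inner loop: each field
            fields[child.tag] = child.text or ""
        records.append(fields)
    return records


records = customer_records(DATA_XML)
for record in records:
    for name, text in record.items():
        # In the article, this is the point where ProcessNodes is called.
        print(name, "=", text)
```

With one dictionary per customer, each record could be merged into its own copy of the template and saved under a separate file name.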

For the ProcessNodes method, the desired node name and new text are passed as parameters. A separate method is used because in my template, I have the ContactName element in two locations within the document. I want to ensure that both of these locations are replaced with the same name.

So, in the ProcessNodes method, the specified node name is used to create XPath queries to retrieve lists of matching nodes. Then each query is executed with the SelectNodes method on the Word 2003 XML DOM object, oXMLWordDoc.


    Public Sub ProcessNodes( _
        ByVal sNodeName As String, _
        ByVal sNodeValue As String)

        'replace the node(s) in the document
        'with the specified value
        Dim oNodeList As XmlNodeList

        'get nodes that have
        'embedded paragraph marks
        oNodeList = _
            oXMLWordDoc.SelectNodes( _
            "//ns0:" + sNodeName + "//w:p", _
            oNSMgr)

The interesting part of the code is the XPath queries. There are two of them, to ensure that you catch all of the nodes with the specified node name. Some of the tagged nodes contain embedded paragraph marks (<w:p> elements), while others sit within a single paragraph, so there is a query to account for each situation.

        If Not oNodeList Is Nothing Then
            FillNodes(oNodeList, sNodeValue)
        End If

        'get nodes that do NOT have
        'embedded paragraph marks
        oNodeList = _
            oXMLWordDoc.SelectNodes( _
            "//ns0:" + sNodeName, oNSMgr)
        If Not oNodeList Is Nothing Then
            FillNodes(oNodeList, sNodeValue)
        End If
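To see why two queries are used, it can help to run both against a tiny document. The sketch below does this in Python with xml.etree.ElementTree, as a stand-in for the article's VB.NET SelectNodes calls. The simplified document, the urn:example:customer-schema namespace, and the find_targets helper are assumptions made for illustration; a real Word 2003 file contains far more markup.

```python
import xml.etree.ElementTree as ET

W_NS = "http://schemas.microsoft.com/office/word/2003/wordml"
NS0 = "urn:example:customer-schema"   # hypothetical schema target namespace
NS = {"w": W_NS, "ns0": NS0}

# Simplified, hypothetical document: the first ContactName wraps a whole
# paragraph, while the second sits inside a paragraph.
DOC_XML = f"""<w:wordDocument xmlns:w="{W_NS}" xmlns:ns0="{NS0}">
  <w:body>
    <ns0:ContactName>
      <w:p><w:r><w:t>placeholder</w:t></w:r></w:p>
    </ns0:ContactName>
    <w:p>
      <w:r><w:t>Dear </w:t></w:r>
      <ns0:ContactName><w:r><w:t>placeholder</w:t></w:r></ns0:ContactName>
    </w:p>
  </w:body>
</w:wordDocument>"""


def find_targets(root, node_name):
    """Run both queries and return (paragraphs inside the element, elements)."""
    # Query 1: paragraphs nested anywhere inside the tagged element.
    paragraphs = root.findall(f".//ns0:{node_name}//w:p", NS)
    # Query 2: the tagged elements themselves.
    elements = root.findall(f".//ns0:{node_name}", NS)
    return paragraphs, elements


root = ET.fromstring(DOC_XML)
block_paragraphs, tagged_elements = find_targets(root, "ContactName")
print(len(block_paragraphs), len(tagged_elements))
```

In this sketch, the second query also returns the element that wraps a whole paragraph. That is harmless as long as the fill step looks only for a w:r/w:t child directly under each node, because that element has no such child and is skipped.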

The namespace prefixes in these queries require that you pass an XmlNamespaceManager object, which is part of .NET's System.Xml namespace, to the SelectNodes method. Otherwise, your SelectNodes query will fail with an error. The XmlNamespaceManager object, stored in the public oNSMgr member of the WordXMLTest class, is populated within the New method, so the prefixes are registered when the WordXMLTest class is instantiated.

Word 2003 enforces the restrictions defined in the schema for each document.
The namespace URIs come directly from the Word 2003 XML file and may vary depending upon the target namespace declared in your schema and what Word 2003 assigns as a prefix to your schema.
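The role of the prefix-to-URI mapping can be illustrated with a short sketch. It is written in Python with xml.etree.ElementTree, where a plain dictionary plays the part that XmlNamespaceManager plays in the VB.NET code. The w URI is the WordprocessingML namespace used by Word 2003; the urn:example:customer-schema URI and the sample document are hypothetical.

```python
import xml.etree.ElementTree as ET

W_NS = "http://schemas.microsoft.com/office/word/2003/wordml"
CUSTOM_NS = "urn:example:customer-schema"   # hypothetical target namespace

# The document happens to use the prefix "ns1" for the custom schema.
DOC_XML = f"""<w:wordDocument xmlns:w="{W_NS}" xmlns:ns1="{CUSTOM_NS}">
  <w:body>
    <w:p><ns1:ContactName><w:r><w:t>placeholder</w:t></w:r></ns1:ContactName></w:p>
  </w:body>
</w:wordDocument>"""

root = ET.fromstring(DOC_XML)

# Prefixes in a query are resolved through this map, not through the
# prefixes used in the file, so "ns0" here matches "ns1" above because
# both point at the same URI.
NS = {"w": W_NS, "ns0": CUSTOM_NS}
matches = root.findall(".//ns0:ContactName", NS)

# Without a map, the prefixed query cannot be resolved at all.
try:
    root.findall(".//ns0:ContactName")
    failed_without_map = False
except SyntaxError:
    failed_without_map = True

print(len(matches), failed_without_map)
```

The point of the sketch is that the URI is what has to match. The prefix in the query is a local alias, which is why the URIs should be copied from your own Word 2003 XML file.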

The FillNodes method referenced in the ProcessNodes method receives a node list object and a new node value as parameters. It changes the contents of the specified nodes on the oXMLWordDoc object.

    Private Sub FillNodes( _
        ByVal oNodeList As XmlNodeList, _
        ByVal sNodeValue As String)

        Dim i As Integer
        Dim oXMLNode, oInnerNode As XmlNode

        For i = 0 To oNodeList.Count - 1
            oXMLNode = oNodeList(i)
            oInnerNode = oXMLNode.SelectSingleNode( _
                "w:r/w:t", oNSMgr)
            If Not oInnerNode Is Nothing Then
                oInnerNode.InnerText = sNodeValue
            End If
        Next
    End Sub

The replacement actually occurs on the text between the <w:t> and </w:t> tags that appear within the specified node object. This ensures that no formatting is lost, as font and paragraph properties are specified in the elements that surround the <w:t> element.
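A small sketch shows how replacing only the <w:t> text leaves the formatting alone. It is written in Python with xml.etree.ElementTree rather than the article's VB.NET, and the fill_nodes helper and the one-paragraph sample are hypothetical. The sample run carries a bold property (<w:b/> inside <w:rPr>) so that the formatting can be checked after the replacement.

```python
import xml.etree.ElementTree as ET

W_NS = "http://schemas.microsoft.com/office/word/2003/wordml"
NS = {"w": W_NS}

# Hypothetical paragraph: one run with a bold property and placeholder text.
PARA_XML = f"""<w:p xmlns:w="{W_NS}">
  <w:r>
    <w:rPr><w:b/></w:rPr>
    <w:t>placeholder</w:t>
  </w:r>
</w:p>"""


def fill_nodes(nodes, new_text):
    """Set the text of the first w:r/w:t under each node; return the count changed."""
    changed = 0
    for node in nodes:
        text_node = node.find("w:r/w:t", NS)
        if text_node is not None:        # skip nodes with no direct run text
            text_node.text = new_text
            changed += 1
    return changed


paragraph = ET.fromstring(PARA_XML)
count = fill_nodes([paragraph], "New Contact")

# The run properties are siblings of <w:t>, so they are left untouched.
bold = paragraph.find("w:r/w:rPr/w:b", NS)
new_text = paragraph.find("w:r/w:t", NS).text
print(count, new_text, bold is not None)
```

Like the article's FillNodes method, this sketch changes only the first matching run under each node, so placeholder text that Word has split across several runs would need extra handling.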

The last bit is to take the modified XML and save it to disk with a different file name so that it can be viewed. This is done by calling the Save method on the Word 2003 XML DOM object:

        'write out the new Doc file.
        oProcess.save(sSaveFile)
    . . .
    Class WordXMLTest
        Public oXMLWordDoc As _
            New XmlDocument
        Public oNSMgr As _
            New XmlNamespaceManager( _
            oXMLWordDoc.NameTable)

        Public Sub save( _
            ByVal sFileName As String)
            oXMLWordDoc.Save(sFileName)
        End Sub

The Final Output
After running the program, you should now be able to double-click the output file and see the output in Word 2003, as shown in Figure 3.



Chuck Urwiler is a Senior Developer and Consultant with EPS Software Corporation, where he develops applications using Office, Visual FoxPro, VB.NET, ASP.NET, XML, and SQL Server. Over the past several years, Chuck has provided training for thousands of developers as well as designing and implementing mission-critical applications using Microsoft tools and technology. He continues to share his knowledge with the developer community through presentations at user groups, Microsoft seminars, and Developer Conferences for VFP, SQL Server, and .NET. Chuck has served as a technical editor on several books regarding SQL Server, MSDE, and e-Commerce. Chuck is a Microsoft Certified Solution Developer (MCSD) and a Certified Database Administrator (MCDBA).