Mapping Style Names to Element Names
Each Word paragraph object has a Style property that holds the paragraph's Style object. So as you iterate through the paragraphs, you want to obtain the Style object and retrieve its name. It turns out that Style objects don't have a Name property; they have a NameLocal property instead, which corresponds to the name you see when you select a style from Word's dropdown style list, and that's exactly what you need. Because the Style property is typed as Object, you must cast its value to a Word.Style object to use the NameLocal property in your code.
Dim stylename As String = CType(p.Style, _
Word.Style).NameLocal
Now that you have the style name for this paragraph, you want to map it to an XML element. There are two considerations. First, the Word style names can contain spaces, while XML element names cannot; therefore, you must either remove or replace the spaces before applying the name to an XML element.
Second, you may not want to map the Word style names directly to XML element names. For example, you might want to map Word's Normal style to a <p> element in the XML document. To do that, you need to write a bit of lookup code to map style names to element names. The sample application contains a StyleMapping class that performs the lookup (see Listing 3). For convenience, the StyleMapping class also contains a fixupName method that replaces any spaces in the Word style name with underscores.
To create an instance of the StyleMapping class, pass its constructor the path of an XML-formatted map file. A map file consists of a root <mapping> tag, which contains any number of <item> tags. Each <item> tag has a style attribute that holds the Word style name and a tag attribute that holds the name of the XML tag that will contain paragraphs of that style.
<?xml version="1.0" encoding="utf-8" ?>
<mapping>
<item style="Heading 1" tag="h1"></item>
<item style="Normal" tag="p"></item>
</mapping>
For example, the preceding map file instructs the application to map the Word style "Heading 1" to an <h1> element and to map the Normal style to a <p> element.
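To make the map file format concrete, here is a minimal sketch that reads the same two mappings with XmlDocument and prints each style/tag pair. It is not taken from the sample application: the module name, the inline XML string, and the Dictionary are assumptions made for illustration, and a real application would load the file from disk instead.

```vbnet
Imports System
Imports System.Collections.Generic
Imports System.Xml

Module MapFileSketch
    Sub Main()
        ' Inline copy of the map file shown above (assumption: a real
        ' application would call XmlDocument.Load with a file path).
        Dim mapXml As String = _
            "<mapping>" & _
            "<item style=""Heading 1"" tag=""h1""></item>" & _
            "<item style=""Normal"" tag=""p""></item>" & _
            "</mapping>"

        Dim doc As New XmlDocument()
        doc.LoadXml(mapXml)

        ' Collect each style/tag pair; the Dictionary is a hypothetical
        ' helper structure, not part of the StyleMapping class.
        Dim map As New Dictionary(Of String, String)
        For Each node As XmlNode In doc.SelectNodes("/mapping/item")
            Dim item As XmlElement = CType(node, XmlElement)
            map(item.GetAttribute("style")) = item.GetAttribute("tag")
        Next

        ' Print every mapping that was read from the map file.
        For Each pair As KeyValuePair(Of String, String) In map
            Console.WriteLine(pair.Key & " -> " & pair.Value)
        Next
    End Sub
End Module
```

Reading the whole map once into a dictionary is only one possible design; the sample application instead queries the XML document on each lookup.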
As written, the application always attempts to look up the style name for every paragraph by calling the StyleMapping.GetStyleToElementMapping method. If that method finds an <item> element with a matching style attribute, it returns the value of the tag attribute; otherwise it "fixes up" the Word style name by calling the private fixupName method and returns the result.
' definition in docToXml method
Dim styleMapper As New StyleMapping( _
Application.StartupPath & "\stylemapping.xml")
' for each paragraph, map the Word style to
' an XML element name
Dim elementName As String = _
styleMapper.GetStyleToElementMapping(stylename)
' In the StyleMapping class
Public Function GetStyleToElementMapping( _
   ByVal aStylename As String) As String
   Dim el As XmlElement = getMapNode(aStylename)
   Dim tagname As String = String.Empty
   ' use the mapped tag name when the map file has one
   If Not el Is Nothing Then
      If el.HasAttribute("tag") Then
         tagname = el.GetAttribute("tag")
      End If
   End If
   ' otherwise fall back to the fixed-up style name
   If tagname = String.Empty Then
      tagname = fixupName(aStylename)
   End If
   Return tagname
End Function
Private Function getMapNode( _
   ByVal aStylename As String) As XmlElement
   ' find the <item> whose style attribute matches
   Dim n As XmlNode = _
      xml.SelectSingleNode("//item[@style='" + _
      aStylename + "']")
   If Not n Is Nothing Then
      Return CType(n, XmlElement)
   Else
      Return Nothing
   End If
End Function
Private Function fixupName(ByVal aStylename _
   As String) As String
   ' XML element names cannot contain spaces
   Return aStylename.Replace(" "c, "_"c)
End Function
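Two edge cases are worth keeping in mind with the lookup code above. A style name that contains an apostrophe would end the quoted XPath string literal early, and a style name that starts with a digit or contains punctuation would still be an invalid XML element name after its spaces are replaced. The following sketch shows one possible way to handle both; it is not part of the sample application. FindItem and EncodeStyleName are hypothetical replacements for getMapNode and fixupName, and the "Reader's Note" style is invented for the example. XmlConvert.EncodeLocalName is a standard System.Xml method that escapes invalid characters as _xHHHH_ sequences, so a space becomes _x0020_ rather than an underscore.

```vbnet
Imports System
Imports System.Xml

Module SaferLookupSketch
    ' Hypothetical variant of getMapNode: compares attribute values in
    ' code instead of embedding the style name in an XPath string.
    Function FindItem(ByVal map As XmlDocument, _
        ByVal styleName As String) As XmlElement
        For Each node As XmlNode In map.SelectNodes("/mapping/item")
            Dim item As XmlElement = CType(node, XmlElement)
            If item.GetAttribute("style") = styleName Then
                Return item
            End If
        Next
        Return Nothing
    End Function

    ' Hypothetical variant of fixupName: escapes every character that
    ' is not legal in an XML name, not only spaces.
    Function EncodeStyleName(ByVal styleName As String) As String
        Return XmlConvert.EncodeLocalName(styleName)
    End Function

    Sub Main()
        ' Invented map entry whose style name contains an apostrophe.
        Dim map As New XmlDocument()
        map.LoadXml("<mapping>" & _
            "<item style=""Reader's Note"" tag=""note""></item>" & _
            "</mapping>")

        Dim el As XmlElement = FindItem(map, "Reader's Note")
        If Not el Is Nothing Then
            Console.WriteLine(el.GetAttribute("tag"))
        End If

        ' An unmapped style name, encoded for use as an element name.
        Console.WriteLine(EncodeStyleName("List Number 2"))
    End Sub
End Module
```

The encoded names are less readable than the underscore version, so this trade-off only makes sense if your documents use style names that the simple replacement cannot handle.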
After obtaining a mapped name, you can create a new XmlElement and append it to the most recent page element.
Dim N As XmlElement = _
xmlDoc.CreateElement(elementName)
N.InnerText = s
pageNode.AppendChild(N)
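As a self-contained illustration of the append step, the following sketch builds a small document and adds two mapped paragraphs to a page element. The root element name "document", the element name "page", and the paragraph text are assumptions chosen for the example; the sample application may use different names.

```vbnet
Imports System
Imports System.Xml

Module AppendElementSketch
    Sub Main()
        Dim xmlDoc As New XmlDocument()

        ' Assumed structure: a root element that contains page elements.
        Dim root As XmlElement = xmlDoc.CreateElement("document")
        xmlDoc.AppendChild(root)
        Dim pageNode As XmlElement = xmlDoc.CreateElement("page")
        root.AppendChild(pageNode)

        ' Append one element per paragraph, using the mapped names.
        Dim heading As XmlElement = xmlDoc.CreateElement("h1")
        heading.InnerText = "Fish & Chips"
        pageNode.AppendChild(heading)

        Dim para As XmlElement = xmlDoc.CreateElement("p")
        para.InnerText = "Body text of the first paragraph."
        pageNode.AppendChild(para)

        ' Show the resulting markup.
        Console.WriteLine(xmlDoc.OuterXml)
    End Sub
End Module
```

Setting InnerText rather than InnerXml means that characters such as & and < in the paragraph text are escaped when the document is serialized, so paragraph content cannot break the markup.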
When the docToXml function has processed all the paragraphs, it returns the completed XML document. The Process button's Click event handler then displays it in the multi-line TextBox (see Figure 3).
Figure 3: The Completed Transformation. After processing the simple sample.doc file, the multi-line TextBox displays the content transformed to XML.