From MSXML3.0 to .NET XML Classes: A Quick Guide

Introduction

As you have probably heard thousands of times, XML is a core technology in .NET. XML is a cornerstone of web services, and ADO.NET recordsets (now called datasets) are stored internally as XML. .NET offers strong support for XML manipulation through its XML classes. In this article I provide a quick guide to migrating from MSXML 3.0 to the .NET XML classes, since there have been some changes in how you access functionality such as XPath and XSLT, how you navigate an XML tree, and how you load and persist XML documents.

Loading and persisting XML documents

The W3C recommendations do not mandate a specific API for loading an XML document into the DOM or for persisting it. Each XML parser is free to implement its own set of methods for this task. In the MSXML parser you load an XML document into the DOM from a file using the load method, while the loadXML method loads the DOM from a string. To persist an XML document in its “string form” to a file you use the save method.
In .NET the DOM API is implemented in the XmlDocument class. XmlDocument exposes a LoadXml method that behaves exactly as in the MSXML 3.0 parser; the Load method, however, has changed slightly. It exposes three overloads. The first takes a string (to load from a URL or a file), the second takes an XmlReader object and the third a TextReader object. As we will see later, the XmlReader provides a way to navigate the XML tree that is more efficient than the DOM.
Note that both XmlReader and TextReader are abstract classes. XmlReader is implemented by a derived class called XmlTextReader. You can use an XmlTextReader object to load XML from a file in the following way:

' Open a forward-only reader on the file and let the DOM consume it
Dim reader As New XmlTextReader("c:myxml.xml")
Dim xmldocument As XmlDocument = New XmlDocument()
xmldocument.Load(reader)
reader.Close()

XmlDocument exposes a Save method with three overloads that mirror those of the Load method; they take, respectively, a string (the file name), an XmlWriter and a TextWriter. Again, XmlWriter and TextWriter are abstract classes. XmlTextWriter is a concrete class that implements XmlWriter.
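The three Save overloads can be sketched as follows. This is a minimal illustration that assumes the released System.Xml API (an XmlTextWriter constructor taking a file name and an encoding, and its Formatting property) rather than the beta1 bits; the file names and the inline sample document are hypothetical.

```vb
Imports System
Imports System.Text
Imports System.Xml

Module SaveSample
    Sub Main()
        ' Build a small document in memory (hypothetical sample data).
        Dim doc As New XmlDocument()
        doc.LoadXml("<books><book author=""John"">XML Basics</book></books>")

        ' Overload 1: save straight to a file name.
        doc.Save("books.xml")

        ' Overload 2: save through an XmlWriter; XmlTextWriter is the concrete class.
        Dim writer As New XmlTextWriter("books-indented.xml", Encoding.UTF8)
        writer.Formatting = Formatting.Indented
        doc.Save(writer)
        writer.Close()

        ' Overload 3: save to any TextWriter, here the console.
        doc.Save(Console.Out)
    End Sub
End Module
```

Going through an XmlTextWriter is mainly useful when you want control over encoding or indentation; for a plain dump the file-name overload is enough.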

Navigating an XML Document

The XmlTextReader class we have just seen provides an alternative to the standard DOM API (implemented in the XmlDocument class): it navigates the XML document with a fast, read-only, forward-only cursor. XmlTextReader is an alternative to SAX that offers a lightweight solution to XML navigation without requiring complex state management in the consuming program.
A SAX parser can be built on top of the XmlTextReader class. Microsoft has promised to provide a full working sample that shows how to do this.
Using the XmlTextReader object is pretty simple:

While reader.Read()
    ' Print the content of every text node, one per line
    If reader.NodeType = XmlNodeType.Text Then
        Console.Write(reader.Value)
        Console.WriteLine()
    End If
End While

Note that the XmlReader is similar to the DataReader provided by ADO.NET: both expose a forward-only, read-only cursor over their data.

The DOM API provides some methods to support searching and filtering on the XML tree, but these methods are rather cumbersome and you need to write a lot of tedious code to use them. As you probably know, XPath is a W3C standard that lets you extract a generic set of nodes from an XML tree using a declarative syntax similar to the one you use to navigate a file system. For instance, the expression that extracts a node-list containing all the nodes named book that have an attribute named author with a value of John is:

//book[@author='John']

This is the abbreviated XPath syntax. In the full XPath syntax the same selection criteria would be written as:

/descendant::book[attribute::author='John']

XPath does not mandate a specific API for exposing its functionality to programmers.
In MSXML 3.0, XPath functionality is exposed through a couple of methods that extend the DOM specification: selectNodes and selectSingleNode. Both take as input a string containing an XPath expression and return, respectively, a node-list and a single node.
In beta1, XPath functionality has been moved from the XmlDocument class to a new class called XmlNavigator.
The XmlNavigator class provides an alternative to the DOM API for navigating and editing an XML document.
Conceptually, XmlNavigator extends XmlReader by adding random-access navigation (without building a node tree; a single node is built on demand when the cursor moves to it).
You apply XPath selections using the Select / SelectSingle methods. These methods do not return a node-list; instead, they restrict the underlying data that the XmlNavigator cursor can access (like a filter on an ADO recordset). To reach the nodes selected by an XPath expression you must call the MoveToNextSelected method, as shown in the following sample.

Dim nav As New DocumentNavigator(doc) 'where doc is an XmlDocument
nav.Select("//book[@author='john']")
nav.MoveToSelected()
...etc
nav.MoveToDocument() 'reset the cursor context to the whole document

Furthermore, XmlNavigator provides methods to edit the XML tree.
The XmlNavigator class is abstract. It is implemented by the DocumentNavigator class, which accepts an XmlDocument in its constructor, and by the DataDocumentNavigator class, which accepts an XmlDataDocument (the XML counterpart of an ADO.NET DataSet; more on it later).
This situation is likely to change in beta2. According to a post I’ve read in the Microsoft XML .NET newsgroup, editing features will be removed from the XmlNavigator class and the Select and SelectSingle methods will be exposed by the XmlDocument class.
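For comparison with the MSXML style, here is a minimal sketch of XPath selection made directly on the document. It assumes the released framework, where these methods ended up on XmlDocument under the names SelectNodes and SelectSingleNode (not the beta1 API described above); the inline sample document is hypothetical.

```vb
Imports System
Imports System.Xml

Module XPathSample
    Sub Main()
        ' Hypothetical sample data: three books, two of them by John.
        Dim doc As New XmlDocument()
        doc.LoadXml("<library>" & _
                    "<book author=""John"">First</book>" & _
                    "<book author=""Mary"">Second</book>" & _
                    "<shelf><book author=""John"">Third</book></shelf>" & _
                    "</library>")

        ' Abbreviated syntax: every matching book element at any depth.
        Dim matches As XmlNodeList = doc.SelectNodes("//book[@author='John']")
        For Each node As XmlNode In matches
            Console.WriteLine(node.InnerText)   ' prints First, then Third
        Next

        ' Full syntax: only the first matching node, in document order.
        Dim first As XmlNode = _
            doc.SelectSingleNode("/descendant::book[attribute::author='John']")
        Console.WriteLine(first.InnerText)      ' prints First
    End Sub
End Module
```

Keep in mind that XPath string comparisons are case sensitive, so 'John' and 'john' select different nodes.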

XML Transformations

XSLT is a template-based language (with an XML-compliant syntax) for transforming an XML document into another XML document, an HTML document or even a non-XML file (such as a CSV file).
In MSXML 3.0, XSL transformations are supported, again, by extending the DOM API with a couple of methods: transformNode and transformNodeToObject. You need to load the XML and the XSL files into two different DOM instances and then call:

Output = Xmldoc.transformNode(Xsldoc)

There is actually an alternative method of performing transformations in MSXML 3.0 that lets you cache compiled stylesheets when you need to apply the same transformation to different documents.
XSL transformations are applied in .NET via the XslTransform class.
The Transform method of this class transforms the XML structure passed as an XmlNavigator object. The transformation result is provided as an XmlReader, an XmlWriter, a TextWriter or a Stream.
This sample transforms an XML document into an HTML document that is sent directly to the ASP response object.

Dim XMLDoc As New XmlDocument()
XMLDoc.Load("c:myxml.xml")
Dim xslt As New XslTransform()
xslt.Load("c:myxsl.xml")
Dim nav As New DocumentNavigator(XMLDoc)
' The second argument is reserved for stylesheet parameters
xslt.Transform(nav, Nothing, Response.Output)

The second parameter can be used to pass parameters into the XSL transformation. Note that this feature is not yet supported in beta1.
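To give an idea of what passing a stylesheet parameter looks like once it is supported, here is a hedged sketch. It assumes the later released API (XslCompiledTransform and XsltArgumentList from System.Xml.Xsl), not the beta1 XslTransform; the stylesheet, the parameter name heading and the sample document are all hypothetical.

```vb
Imports System
Imports System.IO
Imports System.Xml
Imports System.Xml.Xsl

Module TransformSample
    Sub Main()
        ' Source document, built inline so the sketch is self-contained.
        Dim doc As New XmlDocument()
        doc.LoadXml("<books><book author=""John"">XML Basics</book></books>")

        ' Hypothetical stylesheet declaring one parameter named "heading".
        Dim xsl As String = _
            "<xsl:stylesheet version=""1.0"" " & _
            "xmlns:xsl=""http://www.w3.org/1999/XSL/Transform"">" & _
            "<xsl:output method=""html""/>" & _
            "<xsl:param name=""heading""/>" & _
            "<xsl:template match=""/"">" & _
            "<html><body><h1><xsl:value-of select=""$heading""/></h1>" & _
            "<xsl:for-each select=""//book"">" & _
            "<p><xsl:value-of select="".""/></p>" & _
            "</xsl:for-each></body></html>" & _
            "</xsl:template></xsl:stylesheet>"

        ' Compile the stylesheet from the string.
        Dim xslt As New XslCompiledTransform()
        xslt.Load(XmlReader.Create(New StringReader(xsl)))

        ' The argument list carries values for the stylesheet's xsl:param elements.
        Dim args As New XsltArgumentList()
        args.AddParam("heading", "", "Books")

        ' Write the HTML result to the console instead of an ASP response.
        xslt.Transform(doc, args, Console.Out)
    End Sub
End Module
```

In an ASP.NET page you would pass Response.Output as the last argument, as in the sample above.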

XML and Datasets

XML and the DataSet (the evolution of ADO disconnected recordsets) have a lot in common. The DataSet holds its data internally in an XML format. XmlDataDocument is a specialization of the XmlDocument class that provides a relational view of the XML structure; you can get a DataSet from it through its DataSet property.

Dim doc As New XmlDataDocument()
Dim myDs As DataSet
doc.Load("c:myxml.xml")
' Relational view of the XML that was just loaded
myDs = doc.DataSet
Dim books As DataTable
books = myDs.Tables("books")

When you load XML data into a DataSet, a schema is required to map the XML tree onto relational data. You can provide this schema by calling the LoadDataSetMapping method. If you don’t, the XmlDataDocument object tries to infer the schema from the structure of the XML data.
You can also move in the opposite direction, feeding an XmlDataDocument with a DataSet, via one of the two constructors of XmlDataDocument:

Dim doc as new XMLDataDocument(myDataSet)

Conclusions

In this article I’ve provided a quick start to XML support in the .NET Framework, showing how to perform in .NET the common XML tasks you currently carry out with the COM-based Microsoft XML parser.
I suggest you spend some of your “spare” time (a.k.a. after midnight) playing around with the .NET XML support classes: given the central role XML will play in the .NET world, XPath, XSLT and schemas will be as important as marshalling, interfaces and threading models are in the COM world.