Build a Lightweight XML DOM Parser with C#

When you don’t need the full capabilities of an XmlDocument object, you can build a lightweight DOM from an XmlTextReader, the SimpleElement class explained by Guang Yang, and a Stack. This version holds each SimpleElement’s children in a strongly typed SimpleElements collection wrapper rather than the LinkedList class used by the Java version.

To parse a document, create a new SimpleDOMParser instance, and call its parse method, passing an XmlTextReader for the file you want to read, for example:

// requires: using System; using System.Xml;
static void Main() {
   XmlTextReader rdr = null;
   SimpleDOMParser sdp;
   SimpleElement se;
   try {
      rdr = new XmlTextReader(@"some xml file path here");
      sdp = new SimpleDOMParser();
      se = sdp.parse(rdr);
   }
   catch (Exception ex) {
      System.Diagnostics.Debug.WriteLine(ex.Message);
   }
   finally {
      // close the reader even if parsing throws
      if (rdr != null) rdr.Close();
   }
}

The basic parser logic follows the same pattern as the Java or VB versions, but because the XmlTextReader already implements all the code needed to parse the XML text, the code is much simpler. The XmlTextReader class functions like a SAX parser in that it reads the document from start to finish, but rather than raise events for each node type as it encounters them, it sets properties that you can query to handle the various types.
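To make the pull model concrete, here is a minimal sketch, separate from the parser itself, that loops over a small XML string and records what the reader reports at each node. The class and method names (PullModelDemo, Describe) are invented for this illustration and are not part of the original code.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Xml;

public static class PullModelDemo
{
    // Reads the XML text node by node and records the node type
    // together with the name or value the reader exposes for it.
    public static List<string> Describe(string xml)
    {
        List<string> seen = new List<string>();
        using (XmlTextReader reader = new XmlTextReader(new StringReader(xml)))
        {
            // Read() advances to the next node and returns false at the end.
            while (reader.Read())
            {
                switch (reader.NodeType)
                {
                    case XmlNodeType.Element:
                        seen.Add("start:" + reader.LocalName);
                        break;
                    case XmlNodeType.Text:
                    case XmlNodeType.CDATA:
                        seen.Add("text:" + reader.Value);
                        break;
                    case XmlNodeType.EndElement:
                        seen.Add("end:" + reader.LocalName);
                        break;
                    default:
                        // other node types are ignored in this sketch
                        break;
                }
            }
        }
        return seen;
    }

    public static void Main()
    {
        foreach (string entry in Describe("<a><b>hi</b><c/></a>"))
        {
            Console.WriteLine(entry);
        }
    }
}
```

Note that the empty element c produces a start entry but no end entry, which is why the parser has to treat elements ending in "/>" as a special case.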

using System.Collections;
using System.Xml;

public class SimpleDOMParser {
   private XmlTextReader Reader;
   private Stack elements;
   private SimpleElement currentElement;
   private SimpleElement rootElement;

   public SimpleDOMParser() {
      elements = new Stack();
      currentElement = null;
   }

   public SimpleElement parse(XmlTextReader reader) {
      SimpleElement se = null;
      this.Reader = reader;
      // Read() returns false once the end of the document is reached
      while (Reader.Read()) {
         switch (Reader.NodeType) {
         case XmlNodeType.Element :
            // create a new SimpleElement
            se = new SimpleElement(Reader.LocalName);
            // check for "/>" before moving to the attributes,
            // where the flag no longer describes the element
            bool isEmpty = Reader.IsEmptyElement;
            if (elements.Count == 0) {
               rootElement = se;
            }
            else {
               SimpleElement parent = (SimpleElement) elements.Peek();
               parent.ChildElements.Add(se);
            }
            // attributes are read here; Read() never stops
            // on an XmlNodeType.Attribute node by itself
            while (Reader.MoveToNextAttribute()) {
               se.setAttribute(Reader.Name, Reader.Value);
            }
            // don't push empty elements onto the stack
            if (!isEmpty) {
               elements.Push(se);
               currentElement = se;
            }
            break;
         case XmlNodeType.EndElement :
            // pop the top element; any text that follows
            // belongs to its parent
            elements.Pop();
            currentElement = elements.Count > 0
               ? (SimpleElement) elements.Peek() : null;
            break;
         case XmlNodeType.Text :
         case XmlNodeType.CDATA :
            currentElement.Text = Reader.Value;
            break;
         default :
            // ignore
            break;
         }
      }
      return rootElement;
   }
}

You use the returned SimpleElement rootElement to iterate through the tree. For example, the following method appends the indented XML tree to a StringBuilder.

private static void printTree(SimpleElement se, StringBuilder sb, int depth) {
   // opening tag, indented one tab per level
   sb.Append(new string('\t', depth) + "<" + se.TagName);
   foreach (string attName in se.Attributes.Keys) {
      sb.Append(" " + attName + "=" + "\"" + se.Attribute(attName) + "\"");
   }
   sb.Append(">" + se.Text.Trim());
   if (se.ChildElements.Count > 0) {
      sb.Append(System.Environment.NewLine);
      depth += 1;
      foreach (SimpleElement ch in se.ChildElements) {
         printTree(ch, sb, depth);
      }
      depth -= 1;
      // closing tag on its own line, aligned with the opening tag
      sb.Append(new string('\t', depth) + "</" + se.TagName + ">" +
         System.Environment.NewLine);
   }
   else {
      // no children: close the tag on the same line
      sb.Append("</" + se.TagName + ">" + System.Environment.NewLine);
   }
}

The SimpleElement class is straightforward and mimics the functionality of the SimpleElement class explained in this article for Java and VB. The SimpleElements collection class is a strongly typed collection wrapper that inherits from CollectionBase. Each SimpleElement exposes properties for its name and text.
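The source for these two classes is not shown here, so the following is only a sketch of what they might look like. The member names (TagName, Text, Attributes, Attribute, setAttribute, ChildElements) are inferred from the calls made by the parser and printTree; the Hashtable storage, the indexer, and the empty-string default for Text are assumptions.

```csharp
using System;
using System.Collections;

// Strongly typed wrapper over CollectionBase's untyped inner list.
public class SimpleElements : CollectionBase
{
    public int Add(SimpleElement element)
    {
        return List.Add(element);
    }

    // Assumed convenience indexer; the parser itself does not need it.
    public SimpleElement this[int index]
    {
        get { return (SimpleElement)List[index]; }
    }
}

public class SimpleElement
{
    private readonly string tagName;
    private string text = string.Empty; // assumed default so Text.Trim() is safe
    private readonly Hashtable attributes = new Hashtable();
    private readonly SimpleElements childElements = new SimpleElements();

    public SimpleElement(string tagName)
    {
        this.tagName = tagName;
    }

    public string TagName { get { return tagName; } }
    public string Text { get { return text; } set { text = value; } }
    public Hashtable Attributes { get { return attributes; } }
    public SimpleElements ChildElements { get { return childElements; } }

    // Returns null when the attribute is not present.
    public string Attribute(string name)
    {
        return (string)attributes[name];
    }

    public void setAttribute(string name, string value)
    {
        attributes[name] = value;
    }
}

public static class SimpleElementDemo
{
    public static void Main()
    {
        SimpleElement root = new SimpleElement("root");
        root.setAttribute("id", "1");
        SimpleElement child = new SimpleElement("child");
        child.Text = "hello";
        root.ChildElements.Add(child);
        Console.WriteLine(root.TagName + " has " + root.ChildElements.Count + " child");
    }
}
```

Because CollectionBase implements IEnumerable, a foreach loop with a SimpleElement loop variable works over ChildElements without any extra code.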

As you can see, the XmlTextReader considerably simplifies the code required to build the parser. Like the Java and VB versions, this implementation stores only elements, attributes, text content, and CDATA blocks, but you can easily modify it to handle any XML content you wish.
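As one possible extension, the sketch below shows how comment nodes could be picked up with the same read loop. It is a standalone illustration: the CommentCollector class is hypothetical, and where you would store comments in the SimpleElement tree is left open. In the parser itself, the equivalent change would be an extra case for XmlNodeType.Comment in the switch statement.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Xml;

public static class CommentCollector
{
    // Returns the text of every comment in the document, in order.
    public static List<string> Collect(string xml)
    {
        List<string> comments = new List<string>();
        using (XmlTextReader reader = new XmlTextReader(new StringReader(xml)))
        {
            while (reader.Read())
            {
                // For a comment node, Value holds the text between <!-- and -->.
                if (reader.NodeType == XmlNodeType.Comment)
                {
                    comments.Add(reader.Value);
                }
            }
        }
        return comments;
    }

    public static void Main()
    {
        foreach (string c in Collect("<r><!--first--><x/><!--second--></r>"))
        {
            Console.WriteLine(c);
        }
    }
}
```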

You can download the C# code here.
