Login | Register   
RSS Feed
Download our iPhone app
Browse DevX
Sign up for e-mail newsletters from DevX


Working with Red-Black Trees in C#

Substituting red-black trees for simple binary search trees improves data balance and can result in faster searches, particularly for large and well-ordered data.

lthough binary search trees (BSTs) are used widely, as data gets added, binary search trees tend to degenerate into an unordered linked list over time. The "red-black tree" is a related type of binary tree that's often more suitable for certain types of search operations. This article discusses how red-black trees facilitate faster searches and become less disordered over time than similar data structures, and shows how to build and search a red-black tree in C#.

A tree is a data structure that represents hierarchical data by storing it in nodes that are related to one another in specific ways. The topmost node of a tree is called the root. It is the node from which all the other nodes of the tree descend. A tree can have one and only one root. Using the same logic, a tree should have at least one node—the root node. A BST is a special tree structure that ensures both that the tree nodes are ordered, and that each node contains a value. In a binary tree, each node may have only two children, called left and right internal nodes. For any given node, the left child node stores a value that is less than the current node's value, while the right child node's value is greater than the current node's value. Such trees have a "height," which you can calculate by counting the number of links from the root node to the deepest node (the one furthest away from the root) in the tree.

Figure 1. Simple Binary Search Tree: This nine-node tree illustrates the simple left-right decisions made when adding new nodes or searching the tree. (Image adapted from Wikipedia).
The tree in Figure 1 has nine nodes, and a height of three—the distance from the root node (8) to any of the nodes 4, 7, or 13. Note how all left nodes have values less than their parent nodes while right nodes store values greater than their parents. The nodes that have no children are called "leaf nodes." In Figure 1, the nodes that correspond to the values 1, 4, 7 and 13 are all leaf nodes.

To search for a specific item in a binary tree, you start at the root node and repeatedly take either the left or right branch depending on whether the value that you are searching for is greater than or less than the value of the current node. Search operations in a binary tree take O(h) units of time, where, h represents the height of the tree. A BST can easily degenerate into an unbalanced list when added nodes fall disproportionately into one branch of the tree, making that branch far longer than others, thus making searches take longer for that branch than for others. According to MSDN, "The disadvantage of BSTs is that in the worst-case their asymptotic running time is reduced to linear time. This happens if the items inserted into the BST are inserted in order or in near-order. In such a case, a BST performs no better than an array." This is where red-black trees fit into the picture.

A red-black tree is an optimized version of a BST that adds a color attribute to each node. The value of this color attribute value is always either red or black. The root node is always black. In addition to color, each node contains a reference to its parent node, and its child nodes—left and right, as well as an optional key value. Red-black trees perform better than ordinary BST because they use the color attribute and the node references to maintain a better balance.

Red-black trees always follow these rules:

  • The root node of a red-black tree is always black.
  • The path from the root of the tree to any leaf node contains the same number of black nodes throughout the tree, also known as the "black-height" of the tree.
  • Both children of a red node are always black.
  • All "external" nodes—leaf nodes—are always colored black.
The maximum height of a red-black tree that contains n nodes is given by 2log(n+1). The time taken to search for any item in a red-black tree follows the formula O(log n), which implies that it is a good choice for search operations. Figure 2 shows a typical red-black tree.

Figure 2. Red-Black Tree: This tree conforms to all red-black tree rules. (Image adapted from Wikipedia.)
Implementing a BST in C#
It's worth looking at the code for a binary search tree first; you can use it use it to see how the red-black tree implementation differs, and for comparison testing. The code in Listing 1 implements a simple binary search tree. It's interesting to look at the recursive code for searching the tree:

public Node Search(Node node, Object key) { if (node == null) return null; else { int result = String.Compare(key.ToString(), node.key.ToString()); if (result < 0) return Search(node.left, key); else if(result > 0) return Search(node.right, key); else return node; } }

The search method compares the string values of the key parameter and the key of the passed-in node. The result of that comparison sets up the recursive call to search when the key value is either less than (search the left child) or greater than (search the right child) the key value of the node parameter. As the search progresses down the tree, eventually either the key value will match a node value, resulting in a successful search, or the search will fail and the method will return null.

Comment and Contribute






(Maximum characters: 1200). You have 1200 characters left.