isual Studio is chock full of classes to implement exotic data structures such as collections, hash tables, and dictionaries. One notable data structure that's missing is the tree. Trees are useful for storing hierarchical data where each node has one or more children. Many developers use a TreeView control or XML objects to build trees in .NET, but both of those methods are rather heavyweight and come with additional baggage that you may not need or want.
This article explains what trees are, how to build your own trees, how to work with them, and how to get them to do useful things such as drawing themselves, or listing their items in various sequences.
Before getting started, however, you need to understand some tree terminology.
Tree terminology is a cross between horticulture and genealogy. Horticulture supplies terms such as tree, node, root, branch, and leaf, while other tree terms come from genealogy, such as ancestor, parent, and child.
A tree is made up of nodes that are connected by branches. A node can have only one parent, but each node may have zero, one, or more child nodes. The branches are directed, which means you can tell which way they are pointing—so you can always discern which node is the parent and which node(s) are the children.
One node, called the root node has no parent. All other nodes in the tree are connected directly or indirectly to the root node. Normally, when you draw a tree, you draw the root at the top. That's the opposite direction from the way an ash tree or an elm tree grows in nature, but it's easier for most people to draw trees that way. With the root at the top, all branches are directed downward, so it's easy to see which nodes are the children and parents of the others.
A node's degree is its number of children. The degree of the tree is equal to the largest degree of any of the nodes. A tree with a degree of 2 is called a binary tree; a tree with a degree of 3 is called ternary. Beyond that, people usually call the tree "N-ary" (if they call it anything at all). For example, a degree 7 tree would be "7-ary."
A node at the bottom of the tree that has no children is called a leaf node. By definition, a leaf is a node with degree 0.
A node's level is its distance from the root node. To some people, that means the number of branches between the node and the root. Others count the nodes (including the node and the root), resulting in a level that's larger by one than you would get using the other method. The difference is whether you consider the root to be at level 0 or 1. (Much like the way the bottom floor of a building is floor 0 in the UK and floor 1 in the US.) For no particularly good reason, I'm going to consider the root node level to be 1.
The height or level of the entire tree is the same as the largest level of any of its leaf nodes.
Note that branches can only connect parent nodes with child nodes. Branches cannot connect "sibling" nodes (nodes with the same parent) and cannot connect nodes across "generations" or levels in the tree.
|Figure 1. Terrific Tree: A tree is made up of nodes connected by branches.|
One consequence of this rule is that the tree cannot contain any loops or cycles. Another is that there is one unique shortest path from any node to any other node. To find that path, you move up the tree from the starting node towards the root until you reach a node that is a common ancestor of both nodes. You will eventually find such a common ancestor, even if you have to go all the way to the root. After you find the common ancestor, you move down through the branches to the destination node.
Figure 1 shows a simple tree.
Using Figure 1, you should be able to verify that:
- Node A is the root node (it has no parent).
- Nodes D, E, G, H, and I are leaf nodes (they have no children; degree 0).
- The degree of the tree is 3 because node F has the most children—3.
- The tree's height is 4 because the farthest leaves (G, H, and I) are on the fourth level of nodes.
|Figure 2. Binary Bliss: In a full binary tree, every non-leaf node has exactly two children, except possibly one node—in this case node F.|
- The unique shortest path from node D to node H is D—B—A—C—F—H. (You can try finding paths between other pairs of nodes if you like.)
- There are no loops or cycles in the tree's branches.
tree is one where every node either has the full degree of the tree (i.e., as many children as possible) or has no children.
Depending on what you need to do with the tree, you may allow one exception. In that case, the second-to-bottom level has all leaf nodes on the right and the rightmost non-leaf node may have less than the maximum allowed number of children.
For example, Figure 2 shows a full binary tree. In this tree, every non-leaf node has two children except for node F.