RSS Feed
Download our iPhone app
Browse DevX
Sign up for e-mail newsletters from DevX

By submitting your information, you agree that devx.com may send you DevX offers via email, phone and text message, as well as email offers about other products and services that DevX believes may be of interest to you. DevX will process your information in accordance with the Quinstreet Privacy Policy.


Taming Trees: B-Trees : Page 4

Learn how to build a B-tree similar to those used by databases to implement indexes.




Building the Right Environment to Support AI, Machine Learning and Deep Learning

Difficult Deletion
Adding a node to a B-tree can be messy, but deleting a value can be even messier. There are several cases to consider.
Figure 9. Easy Elimination: To remove item B, simply remove it from its node and slide keys C and D left to fill in the space.

First, in the easy case, you search the tree until you find the target item in a leaf node and you simply remove it. For example, suppose you want to remove the item with key value B shown in Figure 9.

In this case, you simply remove B from its leaf node and slide key values C and D over to fill in the hole. Figure 10 shows the result.

Figure 10. Sneaky Swap: To remove item E, first swap it with item D and then remove it from its new position in the leaf node.

But what if the value you want to delete isn't in a leaf node? For example, suppose you want to remove item E from the tree shown in Figure 10. If you remove item E from the root node, then there's no value in that node that you can use to decide whether you should go down the left or center branch when you later search for a value.

The solution is to replace E with a new value that you can use to decide which branch to go down. For the new value to work, it must be larger than any value down the left branch and smaller than any value down the middle branch. The rightmost value to the left of the target value E is the largest value in the tree that is smaller than E. In this example, that value is the value D in the tree's next level. In a bigger tree, the value might be many levels down, but it is always the rightmost value to the left of the target.

Figure 11. Balance Regained: After swapping values D and E and then removing value E, the tree is once again a B-tree.

To remove value E, find the rightmost value to its left (D) and swap it with the target value E. Then continue down the branch to the left to find the target E in its new position and remove it from the leaf node. Figure 11 shows the result.

Notice that the values are again in their correct order and you can use the value D to decide whether to go down the left or middle branch out of the root node.

That explains how to remove a value from a leaf node or an internal node, but what if a leaf node contains too few values? For example, suppose you want to remove the value C from the tree shown in Figure 11. If you just removed that value, then its leaf node would contain only one value and that would violate the B-tree rule that every node in this tree (except the root) must contain between two and four values.

The solution, at least in this example, is to take the node's remaining values (A), combine them with the values in the node's sibling (F, G, and H), and add in the value in the parent node that separates them (D). Putting all of these values together, you have five values: A, D, F, G, and H. That's enough to put the leftmost values (A and D) in the left node, the rightmost values (G and H) in the right node, and move the middle value (F) up to the parent node. Figure 12 shows the result.

Figure 12. Sibling Shuffle: If a leaf doesn't have enough values, you can move some values over from a sibling.

If the leaf's sibling doesn't have enough values to share, then you could try to borrow some from the leaf's other sibling if it has one. For example, if you were removing items from the middle leaf node in Figure 12, you could borrow items from either the left or right leaf.

There's one final case to consider. What if you remove a value from a leaf so that leaf has too few values, but neither of its siblings has enough values to spare? For example, suppose you want to remove the value A from the tree shown in Figure 12. That would leave that leaf with only one value. Normally, you would borrow an item from a sibling node, but this time the sibling has only two values so it can't afford to lose one.

The solution is to notice that if the sibling doesn't have enough values to share, then this node together with its sibling contain so few values that you can merge them into a single node. In this example, you take the values from those two nodes (A, D, G, and H) plus the value in the parent separating them (F) and you put them all in a single node. You remove the separating entry from the parent node and slide the parent's other values over to fill in the hole. Figure 13 shows the result.

Figure 13. Mandatory Merger: If a node contains too few values and its siblings don't have any to spare, merge the node with a sibling.

Remember when adding a new value causes a node to split you can pass a value up to the parent node and sometimes that might cause the parent to split? An analogous situation can occur here. When you remove a value and two nodes merge, you remove a value from the parent node. That may mean the parent node contains too few values so you will need to redistribute its values with a sibling or merge it with a sibling. That would be the case in Figure 13 if the node containing the single value J were not the root node. (Remember the root node is exempt from the B-tree rules.)

Just as a bucket split can travel all the way to the tree's root, a bucket merge can also travel all the way to the root, and it can take quite a bit of time. Figure 14 shows a tree that is primed for merging. (It's the same as Figure 8, which just underwent a root bucket split.)

If you remove the value V from the tree shown in Figure 14, its leaf node contains only one value. Its sibling contains only two values, so it cannot share. The solution is to merge the rightmost two leaf nodes plus their separating value U.

Figure 14. Echoes of Emptiness: Every node in this tree (except the root) contains the minimum number of allowed values.
Figure 15. Topless Tree: After a root bucket merge, the tree grows one level shorter.

But that removes the value U from the parent node, so now that node contains only one value (R). That node's sibling contains only two values (C and G), so it cannot share. That means the node containing R must be merged with its sibling containing C and G, plus their separating value M. Figure 15 shows the new tree.

Comment and Contribute






(Maximum characters: 1200). You have 1200 characters left.



Thanks for your registration, follow us on our social networks to keep up-to-date