RSS Feed
Download our iPhone app
Browse DevX
Sign up for e-mail newsletters from DevX


Taming Trees: B-Trees : Page 3

Learn how to build a B-tree similar to those used by databases to implement indexes.

Addition Accomplished
To add an item to the tree, you follow the same search path that you would take to find the item. You start at the root and move down through the tree until you find the value or you reach a leaf node. (For simplicity, assume the key values must be unique, so you won't find the value in the tree.) When you reach a leaf node, you add the value in the appropriate spot.
Figure 4. Budding Trees: New values are added to leaf nodes.

For example, suppose you want to add the value M in the tree shown in Figure 3. M comes between G and R, so you move down the branch between G and R to the node containing J, K, and P. This is a leaf node so you insert the M here. Note that you need to rearrange the values in the node so they stay in order. Figure 4 shows the result. The new value M is highlighted in pink. The value P, which was moved to make room for M, is highlighted in green.

Of course, the leaf node where you want to place a new value may be full already. For example, suppose you want to add the value N to the tree shown in Figure 4. That value belongs in the node where you just placed value M, but that node is full so there isn't room. To make room, you need to perform a bucket split.

In a bucket split, you take the 2 * K items in the bucket, plus the new item, and arrange them in order. You leave the leftmost K items in the existing bucket and move the rightmost K items into a new bucket. You then pass the middle item up to the parent node. The parent node adds this middle item to its list of items and adds the new bucket as a child.

Figure 5. Tree Rearranged: In a bucket split, a new bucket is created and several values are moved.

In this example, the leaf node's values, together with the new value N, are: J, K, M, N, and P. The leftmost values J and K remain in this leaf node, the rightmost values N and P go into the new node, and the middle value M moves up to the parent node. In the parent node, the existing value R must be moved right to make room for the new value M. Figure 5 shows the new tree. The new value N is highlighted in pink. The values M, P, and R, which were moved during the addition, are highlighted in green.

When you move an item up to the parent node, you don't know whether there's room in the parent node for the new item. If the parent node is full, you'll need to split it, too, and move another value up to its parent node. In the worst case, that node must also split, and the split moves all the way up to the root causing a root bucket split.

When the root node splits, the tree must create a new root that has two children: the old root and the newly created node. This gives B-trees the rather unusual feature of growing from the root rather than from the leaves. Growing from the root ensures that all of the leaf nodes are always at the same level in the tree.

Figure 6. Ripe Root: A bucket split can occur only if the root node is full.

In a large database, a root bucket split working its way all the way up the tree can take a considerable amount of time. Fortunately, root bucket splits don't happen very often. To force a root bucket split in the tree shown in Figure 5, you could add values S and U to fill the rightmost leaf node, and then V to split that node and fill the root node. Figure 6 shows the result.

Now you could add values A and C to get the tree shown in Figure 7 below.

Finally, you can make a root bucket split occur if you add another value to the leftmost leaf node in Figure 7. Figure 8 shows the new tree if you add the value F, causing the leftmost leaf and the root to split. If you compare Figure 7 with Figure 8, you'll see that most of the values in the tree have been rearranged.

Figure 7. Really Ripe Root: A bucket split can occur only if there is a path from the root to a leaf where every node is full.
Figure 8. Full of Nothing: After a root bucket split, a B-tree contains a lot of empty entries.

Causing the root bucket split in the tree shown in Figure 5 required the addition of six values: S, U, V, A, C, and F. Furthermore, I picked the values to quickly fill the root and force a split. Randomly chosen values might have landed in other leaf nodes without forcing a root bucket split, so in typical use a database can go quite a while without a traumatic split.

Also notice that the tree shown in Figure 8 contains a lot of empty space, so it will be quite a while before another root bucket split can occur.

Close Icon
Thanks for your registration, follow us on our social networks to keep up-to-date