In practice, databases pick node sizes that naturally fit the disk drives that hold them. For example, suppose a disk uses a 2KB block size. That means the disk grabs data only in chunks that are multiples of 2KB, so it takes just as long to read 2KB of data as it does to read 10 bytes of data. In that case, you may as well make a B-tree node take up as much space in the 2KB block as possible.
Suppose a tree's key values are 40-character strings (we'll use Unicode, so that's 80 bytes each) and you store a 4-byte pointer with each value pointing to the corresponding record. Each branch within the tree is another 4-byte pointer. Then if each node contains up to 2 * K values, the nodes occupy 2 * K * (80 + 4) + (2 * K + 1) * 4 = 168 * K + 8 * K + 4 = 176 * K + 4 bytes. If you want this to be less than 2KB, then 176 * K + 4 < 2048, so K < (2048 – 4) / 176, which is a bit more than 11. That means you can store between 11 and 22 values per node and still fit a node in one 2KB block.
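You can check this arithmetic with a short sketch. The key size, pointer size, and block size below are the assumptions from the text (40 Unicode characters per key, 4-byte pointers, 2KB blocks):

```python
KEY_BYTES = 80        # 40 Unicode characters at 2 bytes each
PTR_BYTES = 4         # record pointer stored with each value
BRANCH_BYTES = 4      # child pointer between/around the values
BLOCK_BYTES = 2048    # disk block size

def node_bytes(k):
    """Bytes needed by a node holding 2*k values and 2*k + 1 branches."""
    values = 2 * k * (KEY_BYTES + PTR_BYTES)   # 168 * k
    branches = (2 * k + 1) * BRANCH_BYTES      # 8 * k + 4
    return values + branches                   # 176 * k + 4

# Largest k whose node still fits in one block.
k = max(k for k in range(1, 100) if node_bytes(k) <= BLOCK_BYTES)
print(k, node_bytes(k))   # 11 1940
```

A node with K = 11 uses 1,940 of the block's 2,048 bytes; K = 12 would need 2,116 bytes and spill over.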
In that case, each node's minimum degree is 11, so a tree containing N values would be at most about log11(N) levels tall. For example, a tree containing one million values would be only log11(1,000,000), or fewer than six, levels tall. That means to locate a value in a tree containing one million items, you would need to search at most six nodes in the tree, and that would require only six disk accesses. Inserting or deleting a value might take a bit longer, depending on whether bucket splits, sibling redistributions, and bucket merges occur.
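The height estimate is just a logarithm: if every node holds at least 11 values, the number of values reachable within h levels grows roughly as 11 to the power h. A quick sketch of that bound:

```python
import math

def max_levels(n, min_values=11):
    """Rough upper bound on tree height: log base min_values of n,
    rounded up to a whole number of levels."""
    return math.ceil(math.log(n, min_values))

print(max_levels(1_000_000))   # 6
```

Since log11(1,000,000) is about 5.76, a million-value tree needs at most six levels.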
If you make nodes use a larger multiple of the disk's block size, you can build even shorter trees. For example, it may not take much longer to read two adjacent 2KB blocks from a disk than it takes to read a single block. In that case, you could make the nodes twice as large and get even shorter trees. You would spend a tiny bit of extra time reading each node, but you would need to read fewer nodes.
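Rerunning the earlier byte arithmetic for a node that spans two blocks shows the effect. The per-node cost (176 * K + 4 bytes) and the one-million-value example are carried over from the text:

```python
import math

def max_k(node_budget_bytes):
    """Largest K with 176 * K + 4 <= the node's byte budget."""
    return (node_budget_bytes - 4) // 176

for blocks in (1, 2):
    k = max_k(2048 * blocks)
    levels = math.ceil(math.log(1_000_000, k))
    print(blocks, k, levels)   # 1 11 6, then 2 23 5
```

Doubling the node size to 4KB raises the minimum degree from 11 to 23 and trims the million-value tree from six levels to five.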
Another trick that can improve performance is to keep the tree's root and first-level nodes in memory instead of reading them from disk every time you need them. For the previous tree with 2KB nodes holding up to 22 values, that means caching the root plus its up to 23 children: 24 nodes totaling 48KB of memory. By sacrificing a measly 48KB of space, you can remove two disk accesses from a search that normally requires up to six, thus cutting your search time by a third. You still need to write those nodes to disk if they change, but at least your searching will be faster. (You could cache the next level of nodes too, but there could be up to 23 * 23 = 529 nodes on that level, so the memory would start to add up.)
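The cache-size arithmetic is easy to verify. A node holding up to 22 values has up to 23 children, so the root plus its first level is at most 24 nodes of 2KB each:

```python
NODE_BYTES = 2048        # one 2KB disk block per node
MAX_CHILDREN = 23        # a node with up to 22 values has up to 23 children

nodes_cached = 1 + MAX_CHILDREN          # root + first level
print(nodes_cached * NODE_BYTES)         # 49152 bytes, i.e. 48KB
```

Caching the grandchildren as well would add up to 23 * 23 = 529 more nodes, about another megabyte, which is why the article stops at two levels.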
Now You Know How B-trees Work
B-trees let you search for values quickly. They rebalance themselves as you add and remove values, so they always remain relatively short and wide, and that makes searching for items in a B-tree extremely fast. For example, a B-tree containing one million values, using nodes that each hold between 11 and 22 values, needs to examine at most six nodes before finding its target.
This article provided you with a better understanding of how B-trees work and how databases build and maintain indexes. (In fact, many databases use B+trees, but the main ideas are very similar.) To get an even better understanding, build one yourself! Or at least download the B-tree program that I built. It's available in the download area in both Visual Basic and C# versions. I didn't include the code in this article because it's fairly involved, requiring both top-down and bottom-up recursion with multiple routines calling each other to navigate through the tree.