### WEBINAR:

On-Demand

Full Text Search: The Key to Better Natural Language Queries for NoSQL in Node.js

uppose you and your swashbuckling buccaneer friends attack a Spanish galleon. When the smoke clears you find that you have emerged victorious but now you must begin the more difficult and dangerous task of dividing up the booty. If the division isn't reasonably fair, you'll face a mutiny that makes the original skirmish look like a tea party.
The "booty division problem" can be tricky if the treasure includes many discrete items with different values. Sharing 10,000 doubloons is relatively easy but dividing a chest of jewelry with assorted values evenly can be difficult. Fortunately, pirates aren't generally known for their math skills, so perhaps no one will notice that the shares aren't worth the same amount. In computer science, the "booty division problem" usually goes by the more prosaic name "partition problem."

Here are a couple of slightly more realistic partition problem examples. Suppose you need to divide an estate among several heirs. The estate includes a house, some cars, and a boat, none of which can be divided. Estate lawyers have much better math skills than pirates, so if this division isn't reasonably equitable, you could be in trouble. Or suppose you need to assign software developers with different skill levels to several software projects. If the division isn't fair, you may face a mutiny of a different kind.
Dividing a collection of resources evenly can be a tough problem. For example, assume you have 20 items (pieces of treasure, estate assets, developers, or whatever) and you need to divide them into two groups. There are 2^20 or just over 1,000,000 possible ways to assign the items to the two groups. While many are obviously poor assignments (such as assigning all items to one group), many of the possible solutions will look sensible, so picking the best solution can be time-consuming.

As the number of items in the problem increases, the number of possible solutions grows by a power of two. With 30 items to divide, there are about one billion possible solutions; with 40 items, there are more than one trillion possible solutions. Even if your computer can evaluate one million solutions per second, it would take more than 11 days to check them all. You need only add a few more items to make solving the problem take years or longer.
While there's no fast way to enumerate every possible solution and pick the best one, there is a relatively simple model for studying this kind of problem and many others. You can model the assignment of items to one group or another as a decision tree.

This article explains how you can use decision trees to model some extremely complicated problems and describes a few tricks for searching them.
**Trees Please**

Often you can represent combinatorial problems such as the "booty division problem" by using a tree. Each node in the tree corresponds to a partial assignment of the items in the treasure to the buccaneers. Each branch corresponds to making a new assignment.

I'll illustrate by starting with a simple problem where you need to divide items with different values between two people. The tree's root node corresponds to an empty assignment.
The left branch out of the root node corresponds to assigning the first item to the first person. The right branch out of the root corresponds to assigning the first item to the second person.

At the second level of the tree, left branches correspond to assigning the second item to the first person and right branches correspond to assigning the second item to the second person.
More generally, at every node in the tree, the left branch represents assigning the next item to the first person and the right branch corresponds to assigning the next item to the second person. The result is a full binary tree with height equal to the number of items that you need to divide.

Figure 1 shows a decision tree for dividing four items between two people. The nodes below left branches are blue and labeled "0" to indicate that the corresponding items are assigned to Person 0. Nodes below right branches are yellow and labeled "1" to indicate that their corresponding items are assigned to Person 1.

| |

Figure 1. Divide and Conquer: You can model the "booty division problem" for two people with a binary tree. |

Each path from the root to a leaf node corresponds to a complete assignment of the items to the people. The node labels tell you to whom each item is assigned. For example, the leftmost path through the tree assigns every item to the Person 0. The path just to the left of the center in

Figure 1 assigns the first item to the Person 0 and the other three items to Person 1.

You can model many similar problems with decision trees. For example, in shipping, the "bin packing" problem is to pack a number of items of different weights into as few bins with fixed capacity as possible. For example, suppose you have 10 packages with weights between 10 and 20 pounds all destined for the same location, and you want to ship them in boxes that can hold 50 pounds each. How many boxes do you need?
You can model this problem with a decision tree where each branch represents putting an item in a different box. With 10 items, you might need as many as 10 boxes so the tree would have 10 branches at each node and a total of 10^10 = 10 billion nodes.

Variations on the bin packing problem consider the packages' two-dimensional areas, three-dimensional volumes, or exact weights and shapes. You can also model them with decision trees, although determining whether a selection of packages can fit in a box is much more complicated.
The knapsack problem poses another question that you can model with decision trees. The knapsack problem is, given a set of items of different weights and values, and a knapsack that can hold a certain amount of weight, what combination of items should you place in the knapsack to maximize the total value? This problem also has piratical applications (which jewels do you stuff in your sea chest before running from the sinking ship?) in addition to some real-world purpose. For example, which experiments should you load on a small satellite to get the best value?

For another example of decision trees, consider tic-tac-toe. You can represent each board position as a node. Each branch out of a node represents a player taking a particular square.

| |

Figure 2. Your Move: You can model the tic-tac-toe problem with decision tree. |

Figure 2 shows part of the decision tree for tic-tac-toe. The root node represents the initial empty board. The nine nodes on the next level represent player X taking one of the nine available squares. At the third level of the tree, player O can take one of the eight remaining squares so each second-level node has eight branches. The figure shows the third level nodes when player X picks the center square.

Each node in the fourth level of the tree has seven branches corresponding to the seven remaining squares that player X can take. The tic-tac-toe decision tree continues in this manner, with each node having one fewer branches than its parent node, until the tree's ninth level, where the nodes have no branches and represent complete assignments of every square (no more moves) to one player or the other. This tree has 9! = 362,880 nodes (not counting the root node) so it can take a while to search the tree completely.
You can model other games using similar decision trees, although the trees are truly enormous. During each turn a chess player can move one of up to 16 pieces in possibly several ways (for example, you can move a queen in up to 27 ways). If you could move only eight pieces four ways each, then each node would have 32 branches. A decision tree for only the first six moves would contain around 1 billion nodes. While the rest of this article shows some ways to search decision trees in general, this kind of game tree requires special algorithms.

Now that you have an idea of the enormous breadth of problems that decision trees can help solve, this article and the sample application focus on the two-person "booty division problem." You can apply its techniques to other decision trees.