### WEBINAR:

On-Demand

Full Text Search: The Key to Better Natural Language Queries for NoSQL in Node.js

**Exploring the Maximal Flow Algorithm**
In addition to costs, links in some networks also have

*capacities*, which indicate how much of some quantity can flow through the link in a given period of time. For example, a street might be able to hold a maximum number of cars per minute, a wire might be able to hold a certain amount of current, and a storm drain might be able to swallow a certain number of gallons of water per minute.
In a maximal flow algorithm (also called "max flow"), the goal is to figure out how much flow the network can support between two points.

| |

**Figure 5. Fantastic Flows:** In the maximal flow problem, the goal is to find the largest possible flow from one node to another. |

For example,

Figure 5 shows a grid-like storm drain network. Blue links show the maximal flow from node

0,

0 to node

3,

4. The numbers on a link indicate the link's flow and capacity. For example, the

2/4 on the upper

0,

0-

1,

0 link means that link is currently carrying

2 units of flow but could carry up to

4.
Interestingly, not every use for maximal flow calculations models a physical situation. A particularly elegant use of maximal flow calculations performs work assignment. Suppose you have a group of employees

A,

B,

C, and so forth who have particular skills. You also have a set of jobs

1,

2,

3, and so forth that require particular skills. Without maximum flow calculations, matching employees to jobs so you can perform as many jobs as possible can be tricky, particularly if you have lots of employees, jobs, and skills.

To assign jobs with a maximal flow calculation, you can use a network. Create a node for each employee and a node for each job. Connect each employee to the jobs that he or she has the skills to perform. Create artificial start nodes and connect them to each employee. Finally, connect each job to an artificial end node. Give every link a capacity of 1 and calculate the maximal flow from the start node to the end node. The calculation will find an assignment of employees to jobs that lets you do as many jobs as possible. The flows even show which employee to assign to each job.
Figure 6 shows an assignment of four employees to four jobs. In this example, employee A is assigned to job 1, employee C is assigned to job 2, and employee D is assigned to job 3. (The caption on the C-2 link looks like it says 0/1 but that caption is actually for the D-1 link sitting on top of it. The C-2 link really has flow 1/1. You can tell because it's blue.) You can rearrange the assignments to change which jobs get done (for example, employee C could perform job 4 instead of job 3), but there is no assignment that will get all four jobs done.

| |

**Figure 6. Marvelous Matching:** You can use a maximal flow calculation to assign employees to jobs. |

You could probably find a good match for the case shown in

Figure 6 by hand, but imagine that you have to do the employee/job matching task with a few hundred employees and jobs instead. In that case, a maximal flow algorithm could save you a lot of time and trouble.
Finding maximal flows is a bit more complicated than traversing a network or finding minimal spanning trees. The keys to this algorithm are the ideas of

*residual capacity* and

*augmenting paths*.

A link's residual capacity is the amount of flow that you could add to the link. You can think of "unused capacity" if you like ("residual" just sounds more official), although residual capacity has one odd feature that makes "unused" seem not quite right.
Every link in the network implicitly defines a reverse link that connects the same nodes—but in the opposite direction. This link's capacity is equal to the original link's flow at any moment in time. For example, in Figure 5 the 0,0-1,0 link has flow 2, so the reverse link has a residual capacity of 2. In some sense (which will hopefully become clear shortly), you could push 2 units of flow across this reverse link.

The links with residual capacities, together with the implicitly defined reverse links, form a *residual network*. An *augmenting path* is a path through the residual network that connects the start and end nodes.
An augmenting path may cross links that have unused capacity. Moving across such a link corresponds to adding some flow to that link. For example, if a link has flow 2 and capacity 5, its residual capacity is 3, so you can push 1, 2, or 3 additional units of flow across the link.

An augmenting path may also cross implicitly defined reverse links. Moving across such a link corresponds to removing flow from the original link that defined it. For example, suppose a link has flow 2—so its reverse link has capacity 2. Moving across the reverse link corresponds to removing 1 or 2 units of flow from the original link. Removing flow like this may allow you to redistribute flow to use links that were previously underused.
Here's the algorithm:

Find an augmenting path in the residual network from the start node to the end node following links with residual capacities greater than zero.

| |

**Figure 7. Awesome Augmentation:** The maximal flow algorithm uses augmenting paths to improve flow during each step. |

Follow the augmenting path and find the smallest residual capacity of the links in the path. (Suppose the smallest residual capacity is

2.)

Follow the augmenting path again, adding flow to use up the smallest capacity of the links in the path. (Add 2 to each link's flow.) When you follow a reverse link, update the corresponding forward link by removing the corresponding flow from it.
Repeat until there are no more augmenting paths.

Figure 7 shows an example.
When the network contains no flows, the network's links have 0 flows so the implicitly defined reverse links have 0 capacities.

In the first step, because the reverse links have 0 capacities, finding a path is simply a matter of finding a path through the network's original links (shown on the left in Figure 7). One such path visits the nodes 0,0-0,1-0,2-1,2-2,2-3,2.
The links with the smallest residual capacity along this path have residual capacity 2 so the algorithm adds a flow of 2 along each of the links. The image in the middle of Figure 7 shows the updated flows.

The second step is harder to visualize. The blue links in the middle of Figure 7 implicitly define reverse links. All the blue links carry a flow of 2 so their reverse links have a capacity of 2. Now the program must find a path through the residual network using the unused capacity of the regular links plus these reverse links with capacity 2.
A good beginning for a path from the start node towards the end node visits the nodes: 0,0-1,0-2,0-2,1-2,2. At this point, you might like to go to node 3,2 to reach the end node but the 2,2-3,2 link is already at full capacity with a flow of 2 so you cannot push any additional flow across that link.

Instead of following the 2,2-3,2 link, the augmenting path follows the reverse link 2,2-1,2, which has capacity 2. The path can then reach the end node by visiting the nodes 1,3-2,3-3,3-3,2.
The links with the smallest residual capacity along this new path have residual capacity 2 so the algorithm pushes an additional flow of 2 along each of the links. For the regular links, it adds a flow of 2. For the reverse link 2,2-1,2, the algorithm removes a flow of 2 from the corresponding forward link 2,2-1,2. (Do you see how pushing flow across the reverse links rearranged the flow?)

The image on the right of Figure 7 shows the result. At this point, the links leaving the start node are at full capacity so there can be no more augmenting paths and the algorithm is finished.
In practice, you don't really need to build the reverse links. In fact, you don't really need to build the residual network at all. Instead when you consider a node, you can examine the links leaving and entering the node. When you consider moving across a link in its normal direction, you calculate the link's residual capacity by subtracting its flow from its capacity. When you consider moving across a link backwards, you simply use the link's current flow as its residual capacity.

The code in Listing 2 shows how program NetAlgs calculates maximal flows.
The code first resets properties used by the algorithm. Next it enters a loop that continues until there are no more augmenting paths.

Inside the loop, the code sets each node's IsInCl value to False, and sets its BestLink value to Nothing (null in C#). The program uses those values to build the augmenting path, so this removes any previously built augmenting paths. Similarly, the code resets each link's IsInTree property to False to remove it from previous augmenting paths.
Next, the program creates a candidate list to hold nodes in a growing "augmenting tree." It adds the start node to this list and then enters another loop that finds an augmenting path.

While the candidate list is not empty, the program removes the first node from the list and considers its links. If a link leads to a neighboring node that has not already been added to the candidate list, and if the link's residual capacity is greater than 0, then the program adds the node to the candidate list. It also sets the node's BestLink property so the program knows which link was used to add the node to the "augmenting tree."
The algorithm then examines the node's back links in a similar manner.

Before ending the inner loop, the program determines whether the end node is in the candidate list. If the end node is in the list, that means the program has found an augmenting path from the start node to the end node so it breaks out of the inner loop.
Outside the inner loop, the program determines whether it found an augmenting path. If the candidate list emptied before it found an augmenting path, then there are no more augmenting paths so the program breaks out of its outer loop.

Assuming it did find an augmenting path, the program follows it to find the smallest residual capacity along the path. It then follows the path again to update the flows on the links along the path.
The program repeats all these steps until it can find no more augmenting paths.

If you studied the code in the earlier article, "Network Know-How: Finding Shortest Paths," you'll find this code very similar to that used by the label-setting shortest path algorithm. Both algorithms examine the links leaving a node to build a tree rooted at the start node, but there are two main differences. First, the shortest path algorithm selected the next node to visit—so it followed the shortest possible link. The maximal flow algorithm doesn't care what link it follows; it will accept *any* augmenting path, so it just visits the next node in the candidate list. Unlike a label-setting algorithm it cannot make a mistake because any path is fine. And because it can't make a mistake, it doesn't need to go back and fix mistakes like the label-setting algorithm does.
Second, the maximal flow calculation must consider back links. A shortest path algorithm can only follow real links, not implicitly defined back links.