According to observation 1 from the previous paragraph, you can get the maximum amount of binary numbers that substitute the sequences of 1's by splitting the compressed message, as Listing 1
If you execute this procedure with the sample input '10000010011110011' as a parameter, you will get the following result:
numID binNum startPos endPos
----------- ---------------------------------------- ----------- -----------
1 100000 1 6
2 100 7 9
3 111100 10 15
4 11 16 17
This result of split is clearly not the only possible answer. It gives you the maximum amount of binary numbers that substitute the sequences of 1's, where each number in turn is a maximum from a few possible binary numbers. Indeed, each binary number that has more than one zero at the end will be just a maximal number from a few potential candidates for the answer.
For example, the third number (111100) can be 1111 (substitution for fifteen 1's) and 00, or 11110 (substitution for thirty 1's) and 0. However, it cannot be 111100 (substitution for sixty 1's), because the next two 1's would be left out of the compression.
Another portion of entropy comes from the fact that each piece in the result can be combined with its neighbor in the set. For instance, instead of 1, 2, 3, 4 (100000, 100, 111100, 11), you could have 1-2, 3, 4 (100000100, 111100, 11) or 1, 2-4 (100000, 10011110011) and so on. Thus, your task becomes as simple as finding all possible combinations of binary numbers, the replacements, and then for each found number, all possible variations caused by extra zeros.
Building the Model
Assume that each portion of a split-compressed message (see Listing 1) is an edge in a directed edge-labeled graph. For simplicity, the edges of the graph are labeled using numID numbers (1, 2, 3, 4) instead of binNum binary numbers. When portions are merged, they will be labeled with the first and last numbers participating in the merge. Figure 1 illustrates this graph for four numbers (1, 2, 3, 4).
|Figure 1. Graph of Possible Combinations for Four Portions of the Message|
This graph is quite simple. It has only eight distinct combinations for the original four portions. However, it can become very complicated when the number of portions starts to grow. In general, the number of combinations is equal to 2n-1, where n is the maximal amount of the portions received after split. Since the maximal length of the compressed message cannot exceed 40 bits (recall the problem description) and because you can have the message 1010101010101010101010101010101010101010, the maximal amount of split portions will be 20. That number of portions will give you 219 = 524288 possible combinations.
Note that all the edges labeled 1 or 1-... in Figure 1 begin from the same initial node. Similarly, all the edges that have the number 4 in the labels come to the same final node.
In fact, the edges are strictly directed. The label for each current edge cannot be numbered with a number that has been already used to label the preceding edge. In addition, the first number (used to label the current edge) should be greater than the last number (used to label the previous edge) by 1.
Another illustration of the model built to resolve the ACM problem is a collection of trees or forest, which is also a graph. Figure 2 shows such a forest for four portions of the compressed message.
|Figure 2. Forest: All Possible Combinations for Four Portions of the Message|
As previously mentioned, a greater number of portions can make the graph and the forest shown on Figure 1 and Figure 2 respectively very complicated. Figure 3 illustrates a fragment of a forest built for seven portions of the message.
|Figure 3. Fragment of the Forest for a Message Divided into Seven Portions|