TODAY'S HEADLINES  |   ARTICLE ARCHIVE  |   FORUMS  |   TIP BANK
 Specialized Dev Zones Research Center eBook Library .NET Java C++ Web Dev Architecture Database Security Open Source Enterprise Mobile Special Reports 10-Minute Solutions DevXtra Blogs Slideshow
 Sign up for e-mail newsletters from DevX

 Author Feedback Print Article Comment on this Article

# Compress Binary Messages Using SQL : Page 3

## Data compression is a great method for maximizing data storage space and making data communication faster. However, compression and decompression of binary data sometimes can be quite tricky. Learn a few useful data-compression techniques.

 by Alex Kozak
 Sep 30, 2006
 Page 3 of 3

### WEBINAR:On-Demand

Building the Right Environment to Support AI, Machine Learning and Deep Learning

#### Message Decompression

When you have a table with original and compressed values, you can easily decompress data. All you need to do is query the table and find the original message corresponding to what you just received. Listing 9 is an example of message decompression.
```
Listing 9. Message Decompression
```
DECLARE @compresSeq VARCHAR(20);
SELECT @compresSeq = '10001011';
SELECT * FROM binWithCompr
WHERE compresSeq = @compresSeq;

Result:

decNum      originSeq                         compresSeq
----------- --------------------------------- ---------------
139         10001011                          10001011
279         100010111                         10001011
491         111101011                         10001011
983         1111010111                        10001011
18431       100011111111111                   10001011
63487       1111011111111111                  10001011
1048571     11111111111111111011              10001011
``````

Oops, there's a problem. Seven different original messages can be compressed to the same binary number: 10001011. That means you need to get more information in order to find the correct answer. If you knew the number of 1's in the original message, it would definitely help. Listing 10 shows how you can find the number of 1's and/or 0's in all binary numbers in the table using only one query.

```
Listing 10. How to Find Number of 1's and/or 0's in Binary Messages
```
DECLARE @compresSeq VARCHAR(20);
SELECT @compresSeq = '10001011';

SELECT COUNT(*) AS num_of_1s,
SUBSTRING(t2.originSeq, t1.decNum,1) AS binVvalue,
t2.originSeq,
t2.compresSeq
FROM (SELECT decNum FROM binWithCompr
WHERE decNum <=
(SELECT MAX(LEN(originSeq)) FROM binWithCompr
WHERE compresSeq = @compresSeq)) t1
CROSS JOIN
(SELECT originSeq, compresSeq FROM binWithCompr
WHERE compresSeq = @compresSeq) t2
WHERE SUBSTRING(t2.originSeq, t1.decNum,1) = '1'
GROUP BY  SUBSTRING(t2.originSeq, t1.decNum,1), t2.originSeq, t2.compresSeq;

Result:

num_of_1s   binVvalue 	originSeq 		compresSeq
----------- -------------- ------------------- --------------------
7           1         111101011                10001011
12          1         100011111111111          10001011
19          1         11111111111111111011     10001011
8           1         1111010111               10001011
4           1         10001011                 10001011
5           1         100010111                10001011
15          1         1111011111111111         10001011
``````

As you can see, the information about the number of 1's in the original message received with the compressed message can help you uniquely identify the original message. But this is not always correct. In some cases, different original messages with the same number of 1's can be converted into the same compressed message. For example, if you run the script from Listing 10 for @compresSeq = '1010101', you will get the following result:

``````
num_of_1s   binValue 	originSeq  	compresSeq
----------- --------- 	------------ -------------
7           1         	111110101    1010101
10          1         	11111011111  1010101
7           1         	101011111    1010101
7           1         	101111101    1010101
4           1         	1010101      1010101
``````

As you can see, three original messages have seven 1's and two 0's, which can be compressed into the message 1010101. Therefore, there is no way to uniquely identify the original message (unless you have some additional information about it). All that means is you should use this type of compression only in situations where some data loss is acceptable (lossy data compression/decompression such as in image processing and transmission).

#### Message Size Matters

The example uses a hash table I created to make the decompression process easier. For obvious reasons, the described approach can be applied only to relatively short (2-4 bytes) binary messages. The number of binary combinations in 32 bits is 232 = 4,294,967,296. Generating that amount of binary numbers is difficult but doable. Generating all binary combinations for messages that are hundreds or thousands of bits in length, however, is impossible.

Nevertheless, you can use the techniques shown in this article in many data compression implementations and/or when you need to manipulate binary data.

Alex Kozak is a senior DBA/analyst working for SAP Canada. He has more than 15 years of database and programming experience. Microsoft has included some of his articles in the MSDN Library.
 Author Feedback Email Article Print Article Comment on this Article
Thanks for your registration, follow us on our social networks to keep up-to-date