Login | Register   
LinkedIn
Google+
Twitter
RSS Feed
Download our iPhone app
TODAY'S HEADLINES  |   ARTICLE ARCHIVE  |   FORUMS  |   TIP BANK
Browse DevX
Sign up for e-mail newsletters from DevX


advertisement
 

Powerful, Flexible Text-Formatting Solutions in SQL Server : Page 3

You can build very powerful and flexible text-formatting solutions in SQL Server. Learn how to apply techniques that handle the simplest to the more complex text-formatting tasks.


advertisement

A More Complicated Task

Unlike the previous examples, which might look fun but not practical, the solution described in this section potentially is very useful. Suppose you work in a big software company with tens to hundreds of developers that are located in different departments, cities, or even countries. The developers have their own code-writing styles that can vary pretty widely. As a result, the source code for the company's products is hardly readable, and debugging it is very complicated.

To solve this problem, you start constructing naming and code-writing conventions. Unfortunately, they don't work well and you still have tons of code that break the conventions. So you decide to create some kind of black box program that receives the original source code as an input, formats it in accordance with rules that you defined, and then issues the formatted code as output.

For instance, you may want your program to identify keywords in the original listing (code) and start those keywords from a new line with certain indents. Now, really think about this task. It can have many, many rules and very complicated logic with hundreds of branches and conditions. Implementing such logic in your code will cost you a large number of work cycles. And even if you do it, any further change or new rule implementation will turn into a nightmare. Thus, you need to find a solution where the implementation code and the formatting rules are independent.



The following subsection discusses one possible solution.

The Keyword Identification Solution
Assume you have the following fragment of the code that you want to format:

IF @keyWord1 = '' BEGIN IF EXISTS(SELECT keyWord1 FROM rules WHERE keyWord1 = @keyWord2) BEGIN SELECT @str = CHAR(10) + @str + @keyWord2 SELECT @keyWord1 = @keyWord2 END ELSE BEGIN SELECT @str = @str + ' ' + @keyWord2 END END

First, you need to de-normalize the text by converting it and loading it into the table. I created a stored procedure spu_LoadTextTable (see Listing 4) for that purpose, but you can use a script or UDF.

Listing 4. De-Normalize Text; Create and Run spu_LoadTextTable

IF EXISTS (SELECT * FROM INFORMATION_SCHEMA.ROUTINES WHERE SPECIFIC_SCHEMA = N'dbo' AND SPECIFIC_NAME = N'spu_LoadTextTable') DROP PROCEDURE dbo.spu_LoadTextTable GO CREATE PROCEDURE dbo.spu_LoadTextTable @str varchar(8000) AS SET NOCOUNT ON -- preparation before splitting SELECT @str = REPLACE(@str, CHAR(13), ' '); SELECT @str = REPLACE(@str, CHAR(10), ' '); SELECT @str = REPLACE(@str, CHAR(9), ' '); -- split text using recursion technique WITH ProcessText AS (SELECT 1 as startPos, CHARINDEX (' ',@str, 1) as spacePos UNION ALL SELECT spacePos + 1, CHARINDEX (' ',@str, spacePos + 1) FROM ProcessText WHERE spacePos > 0) SELECT startPos, spacePos, SUBSTRING(@str, startPos, spacePos - startPos) AS singleWord INTO #words FROM ProcessText WHERE spacePos <> 0 UNION ALL SELECT MAX(spacePos) + 1, 0, RIGHT(@str, LEN(@str) - MAX(spacePos)) FROM ProcessText OPTION(MAXRECURSION 0); -- deleting extra spaces (rows with empty string in singleWord column DELETE #words WHERE singleWord = ''; IF EXISTS (SELECT * FROM sys.objects WHERE object_id = OBJECT_ID(N'[dbo].words') AND type in (N'U')) DROP TABLE words SELECT ROW_NUMBER() OVER(ORDER BY (SELECT 1)) AS wordID, singleWord INTO words FROM #words; SELECT * FROM words; GO SET QUOTED_IDENTIFIER OFF; GO EXEC spu_LoadTextTable "IF @keyWord1 = '' BEGIN IF EXISTS(SELECT keyWord1 FROM rules WHERE keyWord1 = @keyWord2) BEGIN SELECT @str = CHAR(10) + @str + @keyWord2 SELECT @keyWord1 = @keyWord2 END ELSE BEGIN SELECT @str = @str + ' ' + @keyWord2 END END" Result: wordID singleWord -------------------- ------------------- 1 IF 2 @keyWord1 3 = 4 '' 5 BEGIN 6 IF 7 EXISTS(SELECT 8 keyWord1 9 FROM 10 rules 11 WHERE 12 keyWord1 . . . . . . . . . . . . . . . . . . . . 35 + 36 ' 37 ' 38 + 39 @keyWord2 40 END 41 END

To deal with the rules, I created a rules table:

IF NOT EXISTS (SELECT * FROM sys.objects WHERE object_id = OBJECT_ID(N'[dbo].rules') AND type in (N'U')) CREATE TABLE rules( keyWord1 varchar(50) NULL, keyWord2 varchar(50) NULL, indent int DEFAULT 0)

Then I inserted the rules into the table:

INSERT INTO rules VALUES('IF', 'BEGIN', 0); INSERT INTO rules VALUES('BEGIN', 'IF', 3); INSERT INTO rules VALUES('IF', 'FROM', 3); INSERT INTO rules VALUES('FROM', 'WHERE', 0); INSERT INTO rules VALUES('WHERE', 'BEGIN', -3); INSERT INTO rules VALUES('BEGIN', 'SELECT', 3); INSERT INTO rules VALUES('SELECT', 'SELECT', 0); INSERT INTO rules VALUES('SELECT', 'END', -3); INSERT INTO rules VALUES('END', 'ELSE', 0); INSERT INTO rules VALUES('ELSE', 'BEGIN', 0); INSERT INTO rules VALUES('END', 'END', -3);

Here's how the rules break down: if keyWord1 comes after keyWord2, then keyWord2 has to start from a new line. The indent of the new line will be equal to the current indent value plus an indent value taken from the rule. For example, the last inserted rule (INSERT INTO rules VALUES('END', 'END', -3);) means if a found keyword 'END' has a predecessor keyword 'END', then a new line for 'END' has to start from the very last indent position minus 3.

Finally, I created a stored procedure spu_FormatText (again, it could be a script or UDF) that builds the result (formatted text) in accordance with the rules from the table (see Listing 5).

Listing 5. Formatting Text in Accordance with the Rules

IF EXISTS (SELECT * FROM INFORMATION_SCHEMA.ROUTINES WHERE SPECIFIC_SCHEMA = N'dbo' AND SPECIFIC_NAME = N'spu_FormatText') DROP PROCEDURE dbo.spu_FormatText GO CREATE PROCEDURE dbo.spu_FormatText AS SET NOCOUNT ON DECLARE @str varchar(8000) DECLARE @i int, @m int, @keyWord1 varchar(50), @keyWord2 varchar(50), @spChar int; SELECT @i = 1, @m = 0, @keyWord1 = '', @keyWord2 = '', @spChar = 0, @str = ''; WHILE (1=1) BEGIN SELECT @keyWord2 = singleWord FROM words WHERE wordID = @i IF @@ROWCOUNT = 0 BREAK IF @keyWord1 = '' BEGIN IF EXISTS(SELECT keyWord1 FROM rules WHERE keyWord1 = @keyWord2) BEGIN SELECT @str = CHAR(10) + @str + @keyWord2 SELECT @keyWord1 = @keyWord2 END ELSE SELECT @str = @str + ' ' + @keyWord2 END ELSE BEGIN IF EXISTS(SELECT keyWord1 FROM rules WHERE keyWord1 = @keyWord1 AND keyWord2 = @keyWord2) BEGIN SET @spChar = 1 END ELSE BEGIN SET @spChar = 0 END SELECT @m = @m + ISNULL(indent, 0) FROM rules WHERE keyWord1 = @keyWord1 AND keyWord2 = @keyWord2 IF @spChar = 1 BEGIN SELECT @str = @str + CHAR(10) + SPACE(@m) + @keyWord2 SELECT @keyWord1 = @keyWord2 END IF @spChar = 0 BEGIN SELECT @str = @str + ' ' + @keyWord2 END END SELECT @i = @i + 1 END SELECT @str GO ------------------------------------------------ EXEC spu_FormatText Result: ------------------------------------------------------ IF @keyWord1 = '' BEGIN IF EXISTS(SELECT keyWord1 FROM rules WHERE keyWord1 = @keyWord2) BEGIN SELECT @str = CHAR(10) + @str + @keyWord2 SELECT @keyWord1 = @keyWord2 END ELSE BEGIN SELECT @str = @str + ' ' + @keyWord2 END END

The beauty of the solution in Lisitng 5 is its independence. The solution is completely separated from the formatting rules, which are stored in the rules table. Such an approach allows you to change, add, or delete any rule without touching the code. For example, you can update the indent column by changing the starting position of the keyword. Alternatively, you can delete the rule from the table, in which case the deleted combination of two keywords won't be processed (remaining on the same line of the text). Also, you can add a new rule when your listing has a missing keyword.

Text Formatting to Suite Your Scenario

You can write very powerful and flexible text formatting solutions in SQL Server. In fact, you can apply the black box program in the previous section to any programming language, but you should consider it only as an idea, not a final product. Armed with the techniques in this article, creating the final product is up to you.


Alex Kozak is a senior DBA/analyst working for SAP Canada. He has more than 15 years of database and programming experience. Microsoft has included some of his articles in the MSDN Library.
Comment and Contribute

 

 

 

 

 


(Maximum characters: 1200). You have 1200 characters left.

 

 

Sitemap