A More Complicated Task
Unlike the previous examples, which might look fun but not practical, the solution described in this section potentially is very useful. Suppose you work in a big software company with tens to hundreds of developers that are located in different departments, cities, or even countries. The developers have their own code-writing styles that can vary pretty widely. As a result, the source code for the company's products is hardly readable, and debugging it is very complicated.
To solve this problem, you start constructing naming and code-writing conventions. Unfortunately, they don't work well and you still have tons of code that break the conventions. So you decide to create some kind of black box program that receives the original source code as an input, formats it in accordance with rules that you defined, and then issues the formatted code as output.
For instance, you may want your program to identify keywords in the original listing (code) and start those keywords from a new line with certain indents. Now, really think about this task. It can have many, many rules and very complicated logic with hundreds of branches and conditions. Implementing such logic in your code will cost you a large number of work cycles. And even if you do it, any further change or new rule implementation will turn into a nightmare. Thus, you need to find a solution where the implementation code and the formatting rules are independent.
The following subsection discusses one possible solution.
The Keyword Identification Solution
Assume you have the following fragment of the code that you want to format:
IF @keyWord1 = ''
BEGIN
IF EXISTS(SELECT keyWord1 FROM rules WHERE keyWord1 = @keyWord2)
BEGIN
SELECT @str = CHAR(10) + @str + @keyWord2
SELECT @keyWord1 = @keyWord2
END
ELSE
BEGIN
SELECT @str = @str + ' ' + @keyWord2
END
END
First, you need to de-normalize the text by converting it and loading it into the table.
I created a stored procedure spu_LoadTextTable (see Listing 4) for that purpose, but you can use a script or UDF.
Listing 4. De-Normalize Text; Create and Run spu_LoadTextTable
IF EXISTS (SELECT * FROM INFORMATION_SCHEMA.ROUTINES
WHERE SPECIFIC_SCHEMA = N'dbo'
AND SPECIFIC_NAME = N'spu_LoadTextTable')
DROP PROCEDURE dbo.spu_LoadTextTable
GO
CREATE PROCEDURE dbo.spu_LoadTextTable
@str varchar(8000)
AS
SET NOCOUNT ON
-- preparation before splitting
SELECT @str = REPLACE(@str, CHAR(13), ' ');
SELECT @str = REPLACE(@str, CHAR(10), ' ');
SELECT @str = REPLACE(@str, CHAR(9), ' ');
-- split text using recursion technique
WITH ProcessText AS
(SELECT 1 as startPos, CHARINDEX (' ',@str, 1) as spacePos
UNION ALL
SELECT spacePos + 1, CHARINDEX (' ',@str, spacePos + 1)
FROM ProcessText WHERE spacePos > 0)
SELECT startPos, spacePos,
SUBSTRING(@str, startPos, spacePos - startPos) AS singleWord
INTO #words
FROM ProcessText WHERE spacePos <> 0
UNION ALL
SELECT MAX(spacePos) + 1, 0, RIGHT(@str, LEN(@str) - MAX(spacePos))
FROM ProcessText
OPTION(MAXRECURSION 0);
-- deleting extra spaces (rows with empty string in singleWord column
DELETE #words WHERE singleWord = '';
IF EXISTS (SELECT * FROM sys.objects
WHERE object_id = OBJECT_ID(N'[dbo].words') AND type in (N'U'))
DROP TABLE words
SELECT ROW_NUMBER() OVER(ORDER BY (SELECT 1)) AS wordID, singleWord
INTO words
FROM #words;
SELECT * FROM words;
GO
SET QUOTED_IDENTIFIER OFF;
GO
EXEC spu_LoadTextTable
"IF @keyWord1 = ''
BEGIN
IF EXISTS(SELECT keyWord1 FROM rules WHERE keyWord1 = @keyWord2)
BEGIN
SELECT @str = CHAR(10) + @str + @keyWord2
SELECT @keyWord1 = @keyWord2
END
ELSE
BEGIN
SELECT @str = @str + ' ' + @keyWord2
END
END"
Result:
wordID singleWord
-------------------- -------------------
1 IF
2 @keyWord1
3 =
4 ''
5 BEGIN
6 IF
7 EXISTS(SELECT
8 keyWord1
9 FROM
10 rules
11 WHERE
12 keyWord1
. . . . . . . . . . . . . . . . . . . .
35 +
36 '
37 '
38 +
39 @keyWord2
40 END
41 END
To deal with the rules, I created a rules table:
IF NOT EXISTS (SELECT * FROM sys.objects
WHERE object_id = OBJECT_ID(N'[dbo].rules') AND type in (N'U'))
CREATE TABLE rules(
keyWord1 varchar(50) NULL,
keyWord2 varchar(50) NULL,
indent int DEFAULT 0)
Then I inserted the rules into the table:
INSERT INTO rules VALUES('IF', 'BEGIN', 0);
INSERT INTO rules VALUES('BEGIN', 'IF', 3);
INSERT INTO rules VALUES('IF', 'FROM', 3);
INSERT INTO rules VALUES('FROM', 'WHERE', 0);
INSERT INTO rules VALUES('WHERE', 'BEGIN', -3);
INSERT INTO rules VALUES('BEGIN', 'SELECT', 3);
INSERT INTO rules VALUES('SELECT', 'SELECT', 0);
INSERT INTO rules VALUES('SELECT', 'END', -3);
INSERT INTO rules VALUES('END', 'ELSE', 0);
INSERT INTO rules VALUES('ELSE', 'BEGIN', 0);
INSERT INTO rules VALUES('END', 'END', -3);
Here's how the rules break down: if keyWord1 comes after keyWord2, then keyWord2 has to start from a new line. The indent of the new line will be equal to the current indent value plus an indent value taken from the rule. For example, the last inserted rule (INSERT INTO rules VALUES('END', 'END', -3);) means if a found keyword 'END' has a predecessor keyword 'END', then a new line for 'END' has to start from the very last indent position minus 3.
Finally, I created a stored procedure spu_FormatText (again, it could be a script or UDF) that builds the result (formatted text) in accordance with the rules from the table (see Listing 5).
Listing 5. Formatting Text in Accordance with the Rules
IF EXISTS (SELECT * FROM INFORMATION_SCHEMA.ROUTINES
WHERE SPECIFIC_SCHEMA = N'dbo'
AND SPECIFIC_NAME = N'spu_FormatText')
DROP PROCEDURE dbo.spu_FormatText
GO
CREATE PROCEDURE dbo.spu_FormatText
AS
SET NOCOUNT ON
DECLARE @str varchar(8000)
DECLARE @i int, @m int, @keyWord1 varchar(50), @keyWord2 varchar(50), @spChar int;
SELECT @i = 1, @m = 0, @keyWord1 = '', @keyWord2 = '', @spChar = 0, @str = '';
WHILE (1=1)
BEGIN
SELECT @keyWord2 = singleWord
FROM words
WHERE wordID = @i
IF @@ROWCOUNT = 0 BREAK
IF @keyWord1 = ''
BEGIN
IF EXISTS(SELECT keyWord1 FROM rules WHERE keyWord1 = @keyWord2)
BEGIN
SELECT @str = CHAR(10) + @str + @keyWord2
SELECT @keyWord1 = @keyWord2
END
ELSE
SELECT @str = @str + ' ' + @keyWord2
END
ELSE
BEGIN
IF EXISTS(SELECT keyWord1
FROM rules WHERE keyWord1 = @keyWord1 AND keyWord2 = @keyWord2)
BEGIN
SET @spChar = 1
END
ELSE
BEGIN
SET @spChar = 0
END
SELECT @m = @m + ISNULL(indent, 0)
FROM rules WHERE keyWord1 = @keyWord1 AND keyWord2 = @keyWord2
IF @spChar = 1
BEGIN
SELECT @str = @str + CHAR(10) + SPACE(@m) + @keyWord2
SELECT @keyWord1 = @keyWord2
END
IF @spChar = 0
BEGIN
SELECT @str = @str + ' ' + @keyWord2
END
END
SELECT @i = @i + 1
END
SELECT @str
GO
------------------------------------------------
EXEC spu_FormatText
Result:
------------------------------------------------------
IF @keyWord1 = ''
BEGIN
IF EXISTS(SELECT keyWord1
FROM rules
WHERE keyWord1 = @keyWord2)
BEGIN
SELECT @str = CHAR(10) + @str + @keyWord2
SELECT @keyWord1 = @keyWord2
END
ELSE
BEGIN
SELECT @str = @str + ' ' + @keyWord2
END
END
The beauty of the solution in Lisitng 5 is its independence. The solution is completely separated from the formatting rules, which are stored in the rules table. Such an approach allows you to change, add, or delete any rule without touching the code. For example, you can update the indent column by changing the starting position of the keyword. Alternatively, you can delete the rule from the table, in which case the deleted combination of two keywords won't be processed (remaining on the same line of the text). Also, you can add a new rule when your listing has a missing keyword.
Text Formatting to Suite Your Scenario
You can write very powerful and flexible text formatting solutions in SQL Server. In fact, you can apply the black box program in the previous section to any programming language, but you should consider it only as an idea, not a final product. Armed with the techniques in this article, creating the final product is up to you.