WEBINAR:
On-Demand
Application Security Testing: An Integral Part of DevOps
SQL Server 2005 Recursive Techniques (cont'd)
Now, go back to the
Listing 1. Replace lines 4 and 6 of the code with the following lines:
4 (SELECT 1 as startPos, CHARINDEX (' A',@str, 1) as spacePos
. . . . .
6 SELECT spacePos + 1, CHARINDEX (' A',@str, spacePos + 1)
Running this updated script, Listing 1 will produce the following result:
startPos spacePos singleWord
1 14 A lonely moon
15 30 about the sky,
31 62 A gentle flower in the breeze,
63 160 A giant cloud of smoky stars,. . .sweet voice of melody
161 175 And moist warm
176 189 air, which we
190 0 all breathe Tremendous power. . . everyone believe...
At first glance, the result doesn't make any sense; you just got all the phrases that start with a specific letter (A). But as you will see later, this option can be very useful when you want to format the text. Thus, when you split and load text into a table, you get easy access to each separate word, and that gives you a lot of flexibility and processing power.
But what if you need even more granularity? What if you need to work with each individual letter, instead of the whole word? Well, you can do that using the following script:
Listing 2. How to Convert by Letters and Load Text into the Table
SET QUOTED_IDENTIFIER OFF
GO
DECLARE @str varchar(2000);
SELECT @str =
"A lonely moon about the sky,
A gentle flower in the breeze,
A giant cloud of smoky stars,
Some sticky honey made by bees.
The soft sweet voice of melody
And moist warm air, which we all breathe
Tremendous power of our minds
Which can make everyone believe...";
1 SELECT @str = REPLACE(@str, CHAR(10), ' ');
2 SELECT @str = REPLACE(@str, CHAR(13), '');
3 WITH ProcessText AS
4 (SELECT 1 as startPos, SUBSTRING (@str,1,1) as letter
5 UNION ALL
6 SELECT startPos + 1, SUBSTRING (@str,startPos+1, 1)
7 FROM ProcessText WHERE startPos <= len(@str))
8 SELECT * INTO #letters
9 FROM ProcessText
10 OPTION(MAXRECURSION 0);
SELECT * FROM #letters;
Result:
startPos letter
----------- ------
1 A
2
3 l
4 o
5 n
6 e
7 l
8 y
9
10 m
11 o
12 o
13 n
. . . . . . .
249
250 b
251 e
252 l
253 i
254 e
255 v
256 e
257 .
258 .
259 .
260
The solution in Listing 2 is based on the same idea as the solution in Listing 1. The query recursively applies the SUBSTRING()
function to the original string, incrementing the starting position by one on each recursion's step, and stops processing when the starting position reaches the end of the string (text
).
The solution from Listing 2 gives you direct access to each letter in the text, which gives you additional flexibility and processing power. Now, you can easily determine the number of letters in the text:
SELECT COUNT(*) AS NumLetters
FROM #letters
WHERE letter <> '';
Result:
NumLetters
-----------
219
You can determine how often each distinct letter has been used:
SELECT COUNT(*) AS NumLetters,letter
FROM #letters
WHERE letter <> ''
GROUP BY letter;
Result:
NumLetters letter
----------- ------
4 ,
4 .
14 a
6 b
6 c
6 d
32 e
. . . . . . .
13 t
4 u
3 v
7 w
8 y
1 z
If you change Listing 2 slightly by replacing lines 4 and 6 with the following lines:
4 (SELECT 1 as startPos, SUBSTRING (@str,1,2) as letter
. . . . . . . . . . . .
6 SELECT startPos + 2, SUBSTRING (@str,startPos+2, 2)
You will get the consequent pairs of letters in the result:
startPos letter
----------- ------
1 A
3 lo
5 ne
7 ly
9 m
11 oo
13 n
15 a
17 bo
. . . . . . . .
If you replace lines 4 and 6 with the following lines:
4 (SELECT 1 as startPos, SUBSTRING (@str,1,2) as letter
. . . . . . . . . . . .
6 SELECT startPos + 2, SUBSTRING (@str,startPos+2, 2)
You will get the consequent pairs of letters with overlapping:
startPos letter
----------- ------
1 A
2 lo
3 on
4 ne
5 el
6 ly
7 y
. . . . . . .
The preceding two cases allow text splitting by two letters (it actually can be any number of letters) and potentially are very useful for text formatting or finding duplicates or extra spaces in the text (more on that later).