Login | Register   
LinkedIn
Google+
Twitter
RSS Feed
Download our iPhone app
TODAY'S HEADLINES  |   ARTICLE ARCHIVE  |   FORUMS  |   TIP BANK
Browse DevX
Sign up for e-mail newsletters from DevX


advertisement
 

Fast Text Processing in SQL Server

Processing text or long strings usually reduces SQL to a prosaic procedural language. Learn a few techniques for facilitating speedy text processing in SQL.


advertisement
rocessing text or long strings in SQL has never been an easy task. SQL is a powerful language for performing fast operations on data sets, but when it comes to processing text or long strings, it's usually reduced to a prosaic procedural language. This article shows a few techniques for facilitating speedy text processing in SQL. Although demonstrated in SQL Server, you can apply the underlying ideas to any RDBMS with only small adjustments. Also, no third-party tools, extended stored procedures, or user-defined functions or objects written in any programming language other than SQL (or Transact-SQL) are used.

Using these techniques you will be able to do the following and more without any loops:

  • Determine the number of the words in the text
  • Determine the length and position of each word in the text
  • Determine the number of occurrences of a letter (pattern) and their positions in the text
  • Determine the frequency of each distinct word or letter in the text
  • Eliminate a letter's duplicates
  • Eliminate extra spaces between the words or between lines of text
  • Convert text according to a given format (e.g., define the length of lines in the text or implement more sophisticated formatting of the text)

Preliminary Transformations

SQL is a language dedicated to set-based processing. Text by nature requires one-by-one sequential processing, which is not the strongest feature of SQL. Hence, you can't expect an improvement in text processing if you don't change the layout of the text. In other words, you need to convert text into a structure that allows set-based manipulations.



Traditionally in relational databases, such structures were and continue to be tables. Therefore, to be able to process text using SQL, you need to put it into a table, where each word (or letter) will have a row value in a specific column.

The sections to follow show a few techniques that you can use for text conversion.



Comment and Contribute

 

 

 

 

 


(Maximum characters: 1200). You have 1200 characters left.

 

 

Sitemap