Choosing Between the CLR and T-SQL in SQL Server 2005

icrosoft’s integration of the .NET Framework and the CLR directly into the database engine in SQL Server 2005 is a key enhancement. The result is a major paradigm shift for database application developers and administrators, who can leverage the new functionality in many different ways. In this article, you’ll gain insight into this new programmability architecture, and get some tips on how to choose between the new features and the familiar T-SQL language when building your applications.

Why Use SQLCLR?
SQLCLR is Microsoft’s name for the integration of the CLR as a hosted runtime within SQL Server, enabling the execution of managed code inside the database. SQL Server 2005 is highly integrated with the .NET Framework, letting you create stored procedures, functions, user-defined types, and user-defined aggregates using your favorite .NET programming language. All these constructs can take advantage of large portions of the .NET Framework infrastructure, the base class library, and third-party managed libraries.

In many cases, managed code intended for execution within SQL Server will be very computation-oriented. Functions such as string parsing or scientific math are quite common in the applications that early adopters have created using SQLCLR; however, you can do only so much using number-crunching or string manipulation algorithms in isolation. At some point you have to get input or return results. If that information is relatively small and granular you can use input and output parameters or return values, but if you’re handling a large volume of information then using in-memory structures isn’t an appropriate representation/transfer mechanism?a database might be a better fit in those scenarios. If you choose to store the information in a database then SQLCLR and a data-access infrastructure are the tools you’ll need. And the tool that you’ll need for data access is ADO.NET.

Because ADO.NET “just works” inside SQLCLR, you can leverage all your existing knowledge of ADO.NET to get started. As an example, take a look at the code snippet below. This code would work equally well in a client-side application, Web application or a middle-tier component. It will also work inside SQLCLR.

What You Need
You should have an understanding of the .NET Framework and C# or VB.NET to build the samples included with this article. The tools required to build the sample project are the April CTP of SQL Server 2005 and Whidbey Beta 2.

   using(SqlConnection conn = new SqlConnection(      "server=MyServer; database=AdventureWorks; " +       "user id=MyUser;  password=MyPassword"))       {         conn.Open();         SqlCommand cmd = new SqlCommand(            "SELECT Name, GroupName FROM " +              "HumanResources.Department", conn);         SqlDataReader r = cmd.ExecuteReader();            while(r.Read())          {            // Consume the data from the reader and             // perform some operation/computation with             // it.         }      }

The preceding snippet uses the System.Data.SqlClient provider to connect to SQL Server. Note that if this code runs inside SQLCLR, it would be connecting from the SQL Server that hosts it to another instance of SQL Server, but you can also connect to other data sources. For example, you can use the System.Data.OracleClient provider to connect to an Oracle server directly from inside SQL Server. For the most part, there are no major differences using ADO.NET from within SQLCLR. However, what happens if you want to connect to the same server your code is running in to retrieve or alter data? In that scenario, you use the “context connection.”

Using the Context Connection in SQLCLR
Obviously, one of the most common data access scenarios is when you want a CLR stored procedure or function to connect to the same server on which it’s executing. One option is to create a connection using SqlClient, specifying a connection string that points to the local server, and open the connection. That works, but it creates a separate connection. As a result, you have to specify credentials for login, you’ll create a different database session, there’s a separate transaction context, the connection won’t be able see your temporary tables, etc. But remember that your stored procedure or function code is running inside SQLCLR because some application already connected to this SQL Server and executed a SQL statement to invoke it. To use that same connection, you’ll need to access the context connection.

The context connection, naturally enough, lets you execute SQL statements in the context in which your code was invoked. To obtain the context connection you simply need to use the new “context connection” connection string keyword, as shown in the example below:

   using(SqlConnection conn = new SqlConnection(      "context connection=true"))       {          conn.Open();          // Use the context connection      }
Making a connection to the same server in which your SQLCLR code is running is as simple as creating a new SqlConnection with a “context connection=true” string parameter.

That’s the actual code required?it’s that simple. This release also marks a change from the earlier architecture that included a separate SqlServer and SqlClient namespace, and is designed to simplify the coding experience and to help developers leverage as much of their syntax knowledge from ADO.NET as is possible going forward.

Choosing Between T-SQL and .NET
When choosing between T-SQL and .NET code, your primary concern should be choosing the most appropriate tool for the job. First, here’s an example where a SQLCLR function is appropriate to extend functionality in SQL Server. What if you wanted to implement a function in SQL Server to validate a string? Normally, you might use a regular expression for in a .NET application for such a validation, but in this case, you need to run this code as part of a SQL script to process and clean some data. Here’s an example of a function (see the downloadable code) deployed to SQL Server to enable the use of regular expressions:

   [Microsoft.SqlServer.Server.SqlFunction]      public static bool ValidatePhoneNumber(         string phoneNumber)      {         bool validate;               if (!Regex.IsMatch(phoneNumber,            "^1?\s*-?\s*(\d{3}|\(\s*\" +            "d{3}\s*\))\s*-?\s*\d{3}\s*-" +            "?\s*\d{4}$"))         {            validate = false;         }         else         {            validate = true;         }         return validate;      }

After deploying this function to the server using the Visual Studio integration features for SQLCLR deployment, you can access the function using a simple T-SQL statement such as:

   SELECT [AdventureWorks].[dbo].[ValidatePhoneNumber] (      '314-555-1212')

The preceding statement will evaluate to true, but if the phone number were improperly formatted it would evaluate to false. Building this kind of functionality in T-SQL would not only be tedious and error-prone, but would also be extremely difficult to debug and would inevitably suffer from performance-related issues.

Not a Panacea
But SQLCLR isn’t a panacea, and developers should not use the new features indiscriminately. Although using the new functionality may be interesting, developers will want to be careful when evaluating the features and determining whether or not to take advantage of them when designing and building new applications. Because T-SQL is specifically designed for direct data access and data manipulation, it should still be your first choice for writing code that revolves primarily around data access scenarios and set-based operations.

In addition, while T-SQL excels at data access and database management functions, it is not a fully featured programming language. For example, T-SQL does not provide support for collections, arrays, for each loops, or for organizing code into classes. Some of these features can be simulated in T-SQL, but managed code has integrated support for all of those features, and may provide a better solution. Here are a few key things to think about when evaluating .NET code versus T-SQL for your applications:

Because T-SQL is specifically designed for direct data access and data manipulation, it should still be your first choice for writing code that revolves primarily around data access scenarios and set-based operations.
  • Does the application require the data to be manipulated before the results can be consumed or displayed? If so, lean towards .NET code and SQLCLR.
  • Does the application rely on advanced set-based operations, including the ability to pivot or sort the data in different ways? If so, lean towards T-SQL.
  • Does the application require the use of extensive computation or custom algorithms as part of the result set calculations? If so, lean towards .NET code and SQLCLR.
  • Does the application require development support from DBAs or others who may not be familiar with .NET languages, or does it leverage significant investments in existing T-SQL code? If so, lean towards T-SQL.

Depending on your scenario, the additional features provided by managed code may provide a compelling reason to implement your database logic using managed code. In addition, you can take advantage of the object-oriented features of .NET languages such as C# or VB.NET and of features such as inheritance, polymorphism, encapsulation, and organizing code into classes separated into namespaces. If you’re working on a large project with a significant amount of code that should run on the server, these features can help you organize and manage the associated development and maintenance processes. Managed code is also better for calculations and complicated logic than T-SQL, and features extensive support for many complex tasks such as large string manipulations and regular expressions that T-SQL does not support directly.

At this point, you should probably restrict your use of managed code to CPU-intensive operations and procedures that feature complex logic, and to code that can benefit from the extensive set of class libraries that are part of the .NET Framework.

The Base Class Library (BCL) in the .NET Framework includes classes that provide functionality for advanced arithmetic operations, cryptography, file access and manipulation, and much more. Another benefit of managed code is type safety, which provides assurance that code accesses types in defined and permissible ways. Before executing managed code, the CLR will verify that the code is safe. The CLR can help ensure that your database code does not manipulate unmanaged memory. However, you should use common sense and make sure that you’re making good use of your existing code investments and building an application that can be supported by a broad number of people. Don’t jump to managed code simply because it is new.

Best practices dictate using T-SQL when the code will perform primarily data access with little or no procedural logic. At this point, you should probably restrict your use of managed code to CPU-intensive operations and procedures that feature complex logic, and to code that can benefit from the extensive set of class libraries that are part of the .NET Framework.

Leveraging .NET Languages in SQL Server
Another factor to consider when choosing between T-SQL and managed code is that, with managed code, you can select where you want your code to execute?on the server or the client. Both T-SQL and managed code can be run on the server, placing code and data close together, and letting you take advantage of the processing power of the server. But with the rapid expansion of computing power, many researchers now advocate placing computation as close to the data as possible. On the other hand, you may wish to avoid placing processor-intensive tasks on your database server. Because most client computers today are powerful, you may want to take advantage of this processing power by placing as much code as possible on the client. The architecture of your application will play a big factor here, as will the number and nature of your clients accessing the server. Managed code can run on a client computer, while T-SQL cannot. Using managed code will provide you with more flexibility as a result.

Extended Stored Procedures vs. CLRSQL
Using CLRSQL is not the only way to provide extended functionality to SQL Server. Developers have been able to build extended stored procedures in unmanaged languages that perform functionality not possible with T-SQL stored procedures for quite some time; however, extended stored procedures can?compromise the integrity of the SQL Server process itself, while managed code verified to be type-safe will not compromise that integrity.

Additionally, memory management, scheduling of threads and fibers, and synchronization services are deeply integrated between the managed code of the CLR and SQL Server. These resource management features provide a wealth of reliability compared to that previously available with extended stored procedures, which have direct access to memory and a require wide range of server resources. CLR integration in SQL Server provides a more secure, reliable, and scalable method to write procedures compared to extended stored procedures.

Mixed Emotions
Some people have expressed mixed emotions with respect to this new functionality, concerned that the benefits may be outweighed by the confusion to developers and a lack of conclusive support for the architectural benefits that are often cited. In addition, there are concerns about the performance of the CLR in SQL Server. While the CLR has been optimized to be hosted efficiently within SQL Server and can be configured as to whether or not it can even run inside the server process, until it is applied in the real world such concerns will remain. However, Microsoft is making it clear that best practice will be to write the stored procedures you need to perform tasks not possible in T-SQL using .NET code. With both tools available, the key for developers will be to pick the right tool for the job at hand.

Together, the.NET framework and the Common Language Runtime (CLR) form Microsoft’s preferred platform for executing applications. This platform is one of the core technologies in Windows Server 2003, and is an integral part of server infrastructure such as ASP.NET and Internet Information Services. In addition to SQL Server 2005, Microsoft has begun to integrate .NET technologies into other server applications such as Commerce Server and BizTalk Server. It’s clear that Microsoft has made a huge bet on the success of .NET.

Share the Post:
Share on facebook
Share on twitter
Share on linkedin

Overview

Recent Articles: