devxlogo

The Baker’s Dozen: 13 Productivity Tips for ADO.NET 2.0

The Baker’s Dozen: 13 Productivity Tips for ADO.NET 2.0

DO.NET 2.0 is faster than the first version of ADO.NET; in some instances, significantly faster. While many view ADO.NET 2.0 as more evolutionary than revolutionary, it provides many functions to give developers greater control over data access and data manipulation. It also leverages the new database capabilities in SQL Server 2005. In addition, ADO.NET 2.0 simplifies the creation of multiple-database solutions.

Beginning with the End in Mind
The ability to effectively work with data is essential in a line-of-business application. ADO.NET has a rich object model, but takes time and experience to master, especially when coming from other development environments.

While ADO.NET in Visual Studio .NET 2003 offered tremendous power, ADO.NET 2.0 in Visual Studio 2005 provides even more capabilities, along with major performance gains when inserting, filling, and merging large amounts of data. Here’s a quick rundown of what’s in store for this Baker’s Dozen installment.

  • New capabilities for the DataTable that previously were only available for the DataSet
  • Baker’s Dozen Spotlight: Maximizing the use of typed DataSets
  • The new base class Provider model and how to write generic code to leverage it
  • Performance and optimization enhancements in ADO.NET 2.0
  • Baker’s Dozen Potpourri: miscellaneous data-munging
  • Data relations and look-ups using typed DataSets
  • Asynchronous command execution
  • New capabilities in the LOAD method to create a DataSet without a DataAdapter
  • More control over the rowstate
  • ADO.NET 2.0 introduces a new provider factory class that provides a much easier way to support multiple databases by loading the desired data provider for a specific connection string.
  • The new SqlBulkCopy class
  • Multiple active result sets (MARS)
  • Connection pooling
  • Connection statistics

In addition to the Spotlight and Potpourri sections, I’ve added a new section to this series-the “Baker’s Dozen Commentary,” which appears at the end.

Tip 1: The DataTable Breaks Free
In Visual Studio .NET 2003, several methods were only available for the DataSet. One example is writing (and reading) XML data: a developer who wanted to write the contents of a data table to an XML file had to create a DataSet, add the data table to the DataSet, and use the DataSet’s WriteXml method.

Developers who wanted to utilize those methods for a single data table had to create a DataSet and add the data table just to gain access to these methods. ADO.NET 2.0 now provides several methods for the data table. Table 1 lists these methods.

Table 1: New methods for the DataTable.

Method

ReadXml

WriteXml

ReadXmlSchema

WriteXmlSchema

Merge

Load

GetDataReader


In addition, to these methods, ADO.NET 2.0 allows a developer to serialize a data table in remoting scenarios. The new RemotingFormat property (covered in detail in Tip 4) applies to the data table.

Finally, ADO.NET 2.0 allows developers to create a new data table from any data view. The ToTable method also contains an overload for the developer to create a subset list of columns in the new table. And if that isn’t enough, hold onto your hat?there’s more! ToTable contains an additional overload to filter distinct rows. (Previously, you had to write additional code to accomplish this.) Now a developer simply must specify an array of columns to filter on unique values.

I’m very pleased with the data table enhancements in ADO.NET 2.0. There are some enhancements that I hope Microsoft will consider for the next version. Even with the new independence of the data table, there are still capabilities that require a DataSet. For instance, you can only establish a data relation between two data tables in the same DataSet. Situations where master codes exist in multiple transaction tables requires some additional coding when relations need to be established.

Tip 2: Getting the Most Out of Typed Datasets
A typed DataSet is a subclass of the standard ADO.NET DataSet. It exposes the tables, rows, columns, etc., as strongly-typed properties. Visual Studio 2005 can check these strongly-typed names at compile time, as opposed to runtime. In this sense, strongly-typed DataSets are self-documenting. IntelliSense will display the names in the typed DataSet object hierarchy-exposing column names as properties in the IDE can be very helpful in a large database application. It isn’t necessary to perform the boxing/unboxing that is required when using untyped DataSets.

Typed DataSets simplify handling null values. A strongly-typed data row contains two methods to check if a column value is null, and to set a column value to null.

When used properly, typed DataSets are especially beneficial in a multi-developer environment as well as applications that make heavy use of result sets and reporting. Developers can define typed DataSets as separate project DLLs and then set references to them from other projects.

You’ll see some sample code to demonstrate the basic use of typed DataSets. In addition, developers can subclass a typed DataSet class to add validation code or other methods. Listing 1 includes a brief example that extends typed DataSets through an interface and a subclassed definition.

Recently I developed a reporting solution involving a stored procedure that returned ten tables. The scenario represented a one-many-many relationship with a large number of columns that I didn’t want to retype into the VS.NET typed DataSet designer. I wanted to take the structure of the stored procedure results, turn it into an XML schema, and use it to create a typed DataSet for the project (see Listing 2).

A complaint about typed DataSets is the default naming conventions in the typed DataSet class for DataTable/DataRow objects and methods/events. The generated names may not be consistent with preferred naming conventions. Fortunately, developers can use Typed Dataset Annotations to solve some of the common naming issues. Annotations allow developers to modify the names of elements in the typed DataSet without modifying the actual schema.

Annotations also allow you to specify a nullValue annotation-this instructs the typed DataSet class how to react when a column is DbNull. You can optionally tell the typed DataSet to return an empty string or a default replacement value. Shawn Wildermuth demonstrates typed DataSet annotations in an excellent online article (see the Recommended Reading sidebar).

Tip 3: Supporting Multiple Databases with Provider Factories
In ADO.NET 1.0, supporting multiple databases meant coding against interfaces for the ADO.NET objects (IdbConnection, IdbDataAdapter, IdbCommand, etc.), as well as writing a case statement to determine which connection object to use. ADO.NET 2.0 introduces a new provider factory class, which provides a much easier way to support multiple databases by loading the desired data provider for a specific connection string.

For line-of-business, data-driven applications, the DataSet approach provides many benefits. Tasks such as two-way data-binding, RowFilter/Sorting, XML integration, setting up relationships between data objects, preserving rowstate information, merging changes, and complex reporting operations are much easier with DataSets.

A developer can code against the new classes DbProviderFactory and DbProviderFactories, and pass the desired namespace at runtime. Table 2 displays the methods that a new DbProviderFactory object exposes. Note that the names of the providers must appear in the machine configuration file. Listing 3 demonstrates the use of provider factories.

Table 2: New methods for DbProviderFactory.

Method

CreateConnectionStringBuilder

CreateConnection

CreateCommand

CreateParameter

CreateCommandBuilder

CreateDataAdapter

CreateDataSourceEnumerator

Tip 4: Performance Enhancements in ADO.NET 2.0
Microsoft has made two significant enhancements to ADO.NET that result in better performance. First, they rewrote the indexing engine for better performance. As a result, many operations will execute more quickly, especially as the size of a DataSet grows. Any increase in performance will vary based on many application environment factors, but many should see increases by a factor of two or greater. One test published on MSDN that inserted a million rows into a DataTable took 30 minutes in Visual Studio .NET 2003 and 45 seconds in Visual Studio 2005! Lesser amounts of data will result in lower orders of magnitude; but in general, insert, update, delete, fill, and merge operations will execute more quickly in Visual Studio 2005.

Second, many developers know that passing and returning DataSets across physical boundaries carries substantial overhead and incurs performance penalties. ADO.NET 2.0 supports true binary serialization in remoting environments by providing a new RemotingFormat property, which a developer can set to SerializationFormat.Binary.

Depending on the size of the DataSet, the binary serialization may yield a result as much as six times smaller than XML serialization, resulting in faster transfer speeds and smaller required bandwidth resources. Note that this enhancement only works in remoting environments because Web services (by definition) pass XML.

Tip 5: Baker’s Dozen Potpourri: Data Munging
Roll up your sleeves! It’s time for some elbow grease. Listing 4 demonstrates some miscellaneous tasks that developers new to ADO.NET often need to accomplish:

  • Filtering rows that contain NULL values
  • Filtering on columns in a parent/child relationship
  • Defining primary keys and performing lookups
  • Adding a new row
  • Creating a filter and looping through the results
  • Handling null data with a strongly-typed datarow

Tip 6: Data Relations and Look-Ups with Typed Datasets
The following code references a typed DataSet with two tables.

   // Sample code to demonstrate use of typed DataSets   // Assumed a typed DataSet, DsOrder   // It consists of DtOrderHeader and DtOrderDetail   DsOrder oDsOrder = new DsOrder();    oDsOrder = SomeProcessToRetrieveAnOrder();   DsOrder.DtOrderHeaderDataTable oDtHeader = new    DsOrder.DtOrderHeaderDataTable();   DsOrder.DtOrderDetailDataTable oDtDetail = new    DsOrder.DtOrderDetailDataTable();   // Add an item    DsOrder.DtOrderDetailRow oDetailRow =    new DsOrder.DtOrderDetailRow();   oDetailRow.ItemFK = 100;    // Column names show in Intellisense   oDetailRow.QtyOrder = 5;   oDetailRow.UnitPrice = 5.50;   // AddRow is automatically part of the typed DataSet   oDsOrder.DtOrderDetail.AddDtOrderDetailRow(      oDetailRow);

If a developer defines a relation between the two tables in the typed DataSet, the XML Schema Definition Tool will add methods to navigate through the data without the need to specify the name of the DataRelation.

In the specific example of a master table called DtOrderHeader and a child table called DtOrderDetail, the XML Schema Definition Tool automatically creates a GetDtOrderDetailRows method (similar to GetChildRows) to the strongly typed DataRow class for DtOrderHeader, and a GetDtOrderHeaderRow method (similar to GetParentRow) to the strongly typed DataRow class for DtOrderDetail. This demonstrates yet another advantage of strongly typed DataSets?the XML Schema Definition Tool has already done some of the work.

Strongly-typed DataSets are self-documenting. IntelliSense will display the names in the typed DataSet object hierarchy. They are especially beneficial in a multi-developer environment as well as applications that make heavy use of result sets and reporting.

Additionally, if a typed data table contains a primary key, the class exposes a Find method associated with the column name. A developer can perform a Find against a table with a primary key of OrderID by using the method FindByOrderID. If the primary key is a concatenation of two columns (such as CustomerID and OrderID), the developer would use FindByCustomerIDOrderID. (As mentioned in Tip #2, developers can use Typed Dataset Annotations for alternate naming conventions.

Tip 7: Asynchronous Command Execution
ADO.NET 2.0 introduces a new asynchronous API that allows a program to continue while specific database commands execute. Previously, a developer could simulate asynchronous processing using asynchronous delegates or the ThreadPool class. However, ADO.NET 2.0 provides true asynchronous functionality.

Three methods in ADO.NET 2.0 provide asynchronous capabilities: ExecuteNonQuery, ExecuteReader, and ExecuteXMLReader. ADO.NET provides Begin and End prefix/suffix references to these methods.

   IAsyncResult oAR = command.BeginExecuteReader();   // other processing   while(oAR.IsCompleted==false) {      // do something while waiting   }   SqlDataReader r = command.EndExecuteReader(oAR);

The developer also needs to know when the operation is complete. ADO.NET 2.0 provides three models to detect when an asynchronous command has completed:

  • Callback model: the developer specifies a function to be executed when the command has executed. The begin methods contain an overload for a delegate parameter.
  • Synchronization objects (wait model): the IasyncResult object listed above contains a WaitHandle property
  • Polling: the IasyncResult object also contains an IsCompleted property that subsequent code can evaluate for completion. You can write subsequent code and test for whether IsCompleted is true.
Author’s Note: You must add async=true to the connection string for asynchronous execution to work. If you don’t plan to use asynchronous commands, Microsoft recommends setting this to false.

devx-admin

Share the Post: