A Speed Freak’s Guide to Retrieving Data in ADO.NET

his article benchmarks a few different ways to access data using ADO.NET. Hopefully, after seeing the results of these tests, you will have a better idea concerning what approach to take when accessing data with your application. The intention of this article is not to write the documentation for ADO.NET. Microsoft has already written helpful documentation for ADO.NET (look in the PRC1 or newer help files). We will, however, take a brief look at the overlying structure of ADO.NET.

A Brief Primer on ADO.NET
ADO.NET provides a new architecture for accessing and manipulating data. It is entirely different than the ADO of the VB6 era. With it, we have new ways to access and manipulate data.

The Managed Provider and the DataSet provide the core functionality of ADO.NET.

1. The Managed Provider.
The Managed Provider supplies the following four classes.

  • Connection
  • Command
  • DataReader
  • DataAdapter

The connection and command classes are similar to those found in previous versions of ADO. Objects derived from the connection class are used to connect to data sources. Objects derived from the command class are used to run stored procedures, establish parameter information, modify and return data.

The DataReader class provides read-only and forward-only access to the data source. We will benchmark this object later. The final class of the Managed Provider component is the DataAdapter class. The DataAdapter is the channel through which our second component, the DataSet, connects to the Managed Provider.

2. The DataSet
The DataSet class consists of in-memory DataTables, columns, rows, constraints and relations. The DataSet provides an interface to establish relationships between tables, retrieve and update data. Perhaps one of the most important benefits of the DataSet is its ability to be synchronized with an XmlDataDocument. The DataSet provides real-time hierarchical access to the XmlDataDocument object.

Managed Providers currently come in two flavors. Each is represented by its own namespace in the .NET Framework. The first, System.Data.SQLClient provides access to SQL Server 7.0 and higher databases. The second Managed Provider is found within the System.Data.OleDb namespace, and is used to access any OleDb source. The SQLClient Provider takes advantage of Microsoft SQL Server’s wire format (TDS), and therefore should give better performance than accessing data through the OleDb provider. The performance tests done later in this article will be performed for both provider types.

If we were to create a diagram explaining the general structure of ADO.NET, it would look like this.



In the diagram above, two ways exist to retrieve data from a data source. The first is through the DataReader class in the Managed Provider component. The second is through the DataSet, which accesses the data source through the DataAdapter class of the Managed Provider.

The robust Dataset object gives the programmer the ability to perform functions such as establishing relationships between tables. The DataReader provides read-only and forward-only retrieval of data. The tests uncover what performance loss, if any, we encounter when using the dataset rather than the DataReader.

The Tests
A VB.NET application written with Release Candidate one of .NET, executes several benchmark test. Each test was run 5 times. Tests were run to retrieve 1, 10, 100, 1000 and 10,000 records. One set of tests uses the SQLClient Managed Provider against a SQL Server data source. The second set of tests use the OleDb Managed Provider against the same Sql Server database.

The test data contains the 1990 US Census summary of Last Names. The table contains 80,000+ records. Each record contains four fields. The data resides in a SQL Server 2000 database.

Benchmarks performed include the following:

SQLClient Managed Provider DataReader
DataSet/DataTable
OleDb Managed Provider DataReader
DataSet/DataTable

The Machine

Motherboard: Soyo Dragon Plus
Memory: 512MB Crucial DDR RAM
Processor: AMD XP 1.5GHZ
Disk: Seagate 40GB 7200



The Results
When returning 1 record with the SQLClient DataReader, there were seldom times that registered above zero. The data was returned in under 100 nanoseconds and therefore did not register in the tests. This brings up a cool new feature in .NET. Now we have the ability to go beyond measuring in milliseconds, or thousandths of a second, to measuring with ticks. One tick represents 100 nanoseconds or 100 billionths of a second.. Ticks are a property of the System.DateTime structure. The results of the tests conducted were converted to seconds for readability, but now because of the addition of ticks, the decimal is in the 10,000,000th column.

When retrieving 10,000 records using the OleDb Managed Provider’s DataSet, performance was slowest of all tests at 911+ seconds.

SQL Client

SQL Client

OLEDB

OLEDB

Records

DataReader

DataTable

DataReader

DataTable

1

0.0000000

0.0801152

0.0000000

0.1001440

10

0.0000000

0.1001440

0.200288

0.9012960

20

0.0000000

0.1802592

0.100144

1.9027360

30

0.0100144

0.2804032

0.100144

2.8040320

40

0.0100144

0.3705328

0.100144

3.7053280

50

0.0100144

0.4706768

0.200288

4.7067680

60

0.0100144

0.5608064

0.200288

5.6080640

70

0.0200288

0.6509360

0.200288

6.5093600

80

0.0200288

0.7410656

0.200288

7.4106560

90

0.0200288

0.8412096

0.300432

8.3119520

100

0.0300432

0.9313392

0.300432

9.3133920



SQL Client

SQL Client

OLEDB

OLEDB

Records

DataReader

DataTable

DataReader

DataTable

100

0.0400576

0.9313392

0.300432

9.6138240

200

0.0600864

1.7825630

0.600864

17.8256300

300

0.0801152

2.6638310

0.901296

26.7384500

400

0.1101584

3.5551120

1.101584

35.5511200

500

0.1402016

4.4463930

1.402016

44.4639400

600

0.1702448

5.3376750

1.802592

53.3767500

700

0.2002880

6.2289570

2.00288

62.6901400

800

0.2303312

7.1202380

2.303312

71.1022400

900

0.2703888

8.0015060

2.603744

80.0150500

1000

0.2904176

8.8927870

2.904176

88.9278700



SQL Client

SQL Client

OLEDB

OLEDB

Records

DataReader

DataTable

DataReader

DataTable

1000

0.3004320

8.9628880

3.104464

90.5301700

2000

0.6108784

18.3463800

6.00864

178.1562000

3000

0.9313392

26.7785100

9.113104

267.3845000

4000

1.2317710

35.7013400

12.21757

356.5126000

5000

1.5221890

44.6542100

15.42218

446.0414000

6000

1.8827070

53.8073700

18.4265

553.4959000

7000

2.1931540

63.5113300

21.03024

636.4151000

8000

2.4334990

72.8046900

23.93442

726.8452000

9000

2.7639740

81.7876100

27.43946

832.7975000

10000

3.0844350

91.6217400

31.14478

911.7110000



As you can see from the results, it doesn’t seem to matter whether we are retrieving 10 or 10,000 records, the performance degradation of the DataSet/DataTable is dramatic when compared to the DataReader. It is no big surprise that the DataReader is faster. What is surprising is that accessing data through the DataReader is a magnitude of 27-56 times faster than accessing data using the DataSet.

Another interesting observation is that the SQLClient Managed Provider is on average 10 times faster than the OleDB Managed Provider accessing the same data source. One could conclude that this loss in peformance would be consistent with any OleDb source.


The priority for most Web Developers is to build reliable, scaleable sites where performance is king. The DataReader is our best, if not our only, high performance alternative. If you’re a speed junky like me then the DataReader will also give you the thrill you’re looking for.

Share the Post:
Share on facebook
Share on twitter
Share on linkedin

More From DevX