A Speed Freak's Guide to Retrieving Data in ADO.NET
Web Developers love speed. Whether it's fast cars or fast code, raw power and high speed will get the most mundane developer pumped up. It's our nature. Sure, developers seek application horsepower to help the business, but underlying this pragmatism is the reality that we enjoy trying to eke out every last drop of speed from our applications. In the process, our sites run faster, our customers are happier and we are temporarily satiated...until the next need for speed arises.
If you have a need for speed and aren't sure of the best way to get data using ADO.NETthen this article is for you.
by
Craig Davis

his article benchmarks a few different ways to access data using ADO.NET. Hopefully, after seeing
the results of these tests, you will have a better idea concerning what approach to take when
accessing data with your application. The intention of this article is not to write the documentation
for ADO.NET. Microsoft has already written helpful documentation for ADO.NET (look in the PRC1 or
newer help files). We will, however, take a brief look at the overlying structure of ADO.NET.
A Brief Primer on ADO.NET
ADO.NET provides a new architecture for accessing and manipulating data. It is entirely different
than the ADO of the VB6 era. With it, we have new ways to access and manipulate data.
The
Managed Provider and the
DataSet provide the core functionality
of ADO.NET.
1. The Managed Provider.
The Managed Provider supplies the following four classes.
- Connection
- Command
- DataReader
- DataAdapter
The connection and command classes are similar to those found in previous versions of ADO. Objects
derived from the connection class are used to connect to data sources. Objects derived from the
command class are used to run stored procedures, establish parameter information, modify and return
data.
The DataReader class provides read-only and forward-only access to the data source. We will benchmark
this object later. The final class of the Managed Provider component is the DataAdapter class. The
DataAdapter is the channel through which our second component, the
DataSet, connects
to the Managed Provider.
2. The DataSet
The DataSet class consists of in-memory DataTables, columns, rows, constraints and relations. The
DataSet provides an interface to establish relationships between tables, retrieve and update data.
Perhaps one of the most important benefits of the DataSet is its ability to be synchronized with an
XmlDataDocument. The DataSet provides real-time hierarchical access to the XmlDataDocument object.
Managed Providers currently come in two flavors. Each is represented by its own namespace in the .NET
Framework. The first,
System.Data.SQLClient provides access to SQL Server 7.0 and
higher databases. The second Managed Provider is found within the
System.Data.OleDb
namespace, and is used to access any OleDb source. The SQLClient Provider takes advantage of
Microsoft SQL Server's wire format (TDS), and therefore should give better performance than accessing
data through the OleDb provider. The performance tests done later in this article will be performed
for both provider types.
If we were to create a diagram explaining the general structure of ADO.NET, it would look like this.
In the diagram above, two ways exist to retrieve data from a data source. The first is through the
DataReader class in the Managed Provider component. The second is through the DataSet, which accesses
the data source through the DataAdapter class of the Managed Provider.
The robust Dataset object gives the programmer the ability to perform functions such as establishing
relationships between tables. The DataReader provides read-only and forward-only retrieval of data.
The tests uncover what performance loss, if any, we encounter when using the dataset rather than the
DataReader.
The Tests
A VB.NET application written with Release Candidate one of .NET, executes several benchmark test.
Each test was run 5 times. Tests were run to retrieve 1, 10, 100, 1000 and 10,000 records. One set of
tests uses the SQLClient Managed Provider against a SQL Server data source. The second set of tests
use the OleDb Managed Provider against the same Sql Server database.
The test data contains the 1990 US Census summary of Last Names. The table contains 80,000+ records.
Each record contains four fields. The data resides in a SQL Server 2000 database.
Benchmarks performed include the following:
| SQLClient Managed Provider | DataReader |
| DataSet/DataTable |
| OleDb Managed Provider | DataReader |
| DataSet/DataTable |
The Machine
| Motherboard: | Soyo Dragon Plus |
| Memory: |
512MB Crucial DDR RAM
|
| Processor: | AMD XP 1.5GHZ |
| Disk: | Seagate 40GB 7200 |
The Results
When returning 1 record with the SQLClient DataReader, there were seldom times that registered above
zero. The data was returned in under 100 nanoseconds and therefore did not register in the tests.
This brings up a cool new feature in .NET. Now we have the ability to go beyond measuring in
milliseconds, or thousandths of a second, to measuring with ticks. One tick represents 100
nanoseconds or 100 billionths of a second.. Ticks are a property of the System.DateTime structure.
The results of the tests conducted were converted to seconds for readability, but now because of the
addition of ticks, the decimal is in the 10,000,000th column.
When retrieving 10,000 records using the OleDb Managed Provider's DataSet, performance was slowest of
all tests at 911+ seconds.
|
|
SQL Client |
SQL Client |
OLEDB |
OLEDB |
|
Records |
DataReader |
DataTable |
DataReader |
DataTable |
|
1 |
0.0000000 |
0.0801152 |
0.0000000 |
0.1001440 |
|
10 |
0.0000000 |
0.1001440 |
0.200288 |
0.9012960 |
|
20 |
0.0000000 |
0.1802592 |
0.100144 |
1.9027360 |
|
30 |
0.0100144 |
0.2804032 |
0.100144 |
2.8040320 |
|
40 |
0.0100144 |
0.3705328 |
0.100144 |
3.7053280 |
|
50 |
0.0100144 |
0.4706768 |
0.200288 |
4.7067680 |
|
60 |
0.0100144 |
0.5608064 |
0.200288 |
5.6080640 |
|
70 |
0.0200288 |
0.6509360 |
0.200288 |
6.5093600 |
|
80 |
0.0200288 |
0.7410656 |
0.200288 |
7.4106560 |
|
90 |
0.0200288 |
0.8412096 |
0.300432 |
8.3119520 |
|
100 |
0.0300432 |
0.9313392 |
0.300432 |
9.3133920 |
|
|
|
|
|
|
|
|
SQL Client |
SQL Client |
OLEDB |
OLEDB |
|
Records |
DataReader |
DataTable |
DataReader |
DataTable |
|
100 |
0.0400576 |
0.9313392 |
0.300432 |
9.6138240 |
|
200 |
0.0600864 |
1.7825630 |
0.600864 |
17.8256300 |
|
300 |
0.0801152 |
2.6638310 |
0.901296 |
26.7384500 |
|
400 |
0.1101584 |
3.5551120 |
1.101584 |
35.5511200 |
|
500 |
0.1402016 |
4.4463930 |
1.402016 |
44.4639400 |
|
600 |
0.1702448 |
5.3376750 |
1.802592 |
53.3767500 |
|
700 |
0.2002880 |
6.2289570 |
2.00288 |
62.6901400 |
|
800 |
0.2303312 |
7.1202380 |
2.303312 |
71.1022400 |
|
900 |
0.2703888 |
8.0015060 |
2.603744 |
80.0150500 |
|
1000 |
0.2904176 |
8.8927870 |
2.904176 |
88.9278700 |
|
|
|
|
|
|
|
|
SQL Client |
SQL Client |
OLEDB |
OLEDB |
|
Records |
DataReader |
DataTable |
DataReader |
DataTable |
|
1000 |
0.3004320 |
8.9628880 |
3.104464 |
90.5301700 |
|
2000 |
0.6108784 |
18.3463800 |
6.00864 |
178.1562000 |
|
3000 |
0.9313392 |
26.7785100 |
9.113104 |
267.3845000 |
|
4000 |
1.2317710 |
35.7013400 |
12.21757 |
356.5126000 |
|
5000 |
1.5221890 |
44.6542100 |
15.42218 |
446.0414000 |
|
6000 |
1.8827070 |
53.8073700 |
18.4265 |
553.4959000 |
|
7000 |
2.1931540 |
63.5113300 |
21.03024 |
636.4151000 |
|
8000 |
2.4334990 |
72.8046900 |
23.93442 |
726.8452000 |
|
9000 |
2.7639740 |
81.7876100 |
27.43946 |
832.7975000 |
|
10000 |
3.0844350 |
91.6217400 |
31.14478 |
911.7110000 |
As you can see from the results, it doesn't seem to matter whether we are retrieving 10 or 10,000
records, the performance degradation of the DataSet/DataTable is dramatic when compared to the
DataReader. It is no big surprise that the DataReader is faster. What is surprising is that accessing
data through the DataReader is a magnitude of 27-56 times faster than accessing data using the
DataSet.
Another interesting observation is that the SQLClient Managed Provider is on
average 10 times faster than the OleDB Managed Provider accessing the same data source. One could
conclude that this loss in peformance would be consistent with any OleDb source.
The priority for most Web Developers is to build reliable, scaleable sites where performance is king.
The DataReader is our best, if not our only, high performance alternative. If you're a speed junky
like me then the DataReader will also give you the thrill you're looking for.
Craig Davis has been a developer for 11 years, a Web developer for 6 of those years and is currently the Web Development Manager at Crucial Technologya Division of Micron. Craig can be contacted via e-mail at
craig@chilibean.com.